Andrei Panferov
64c05eecd6
HIGGS Quantization Support (#34997)
...
* higgs init
* working with crunches
* per-model workspaces
* style
* style 2
* tests and style
* higgs tests passing
* protecting torch import
* removed torch.Tensor type annotations
* torch.nn.Module inheritance fix maybe
* hide inputs inside quantizer calls
* style structure something
* Update src/transformers/quantizers/quantizer_higgs.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* reworked num_sms
* Update src/transformers/integrations/higgs.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* revamped device checks
* docstring upd
* Update src/transformers/quantizers/quantizer_higgs.py
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com >
* edited tests and device map assertions
* minor edits
* updated flute cuda version in docker
* Added p=1 and 2,3bit HIGGS
* flute version check update
* incorporated `modules_to_not_convert`
* less hardcoding
* Fixed comment
* Added docs
* Fixed gemma support
* example in docs
* fixed torch_dtype for HIGGS
* Update docs/source/en/quantization/higgs.md
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Collection link
* dequantize interface
* newer flute version, torch.compile support
* unittest message fix
* docs update compile
* isort
* ValueError instead of assert
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com >
2024-12-23 16:54:49 +01:00
wejoncy
4e27a4009d
FEAT : Adding VPTQ quantization method to HFQuantizer (#34770)
...
* init vptq
* add integration
* add vptq support
fix readme
* add tests && format
* format
* address comments
* format
* format
* address comments
* format
* address comments
* remove debug code
* Revert "remove debug code"
This reverts commit ed3b3eaaba82caf58cb3aa6e865d98e49650cf66.
* fix test
---------
Co-authored-by: Yang Wang <wyatuestc@gmail.com >
2024-12-20 09:45:53 +01:00
ivarflakstad
bc6ae0d55e
Update AMD docker image (rocm 6.1) (#35259)
...
* Use rocm 6.3 as base amd image and add nvidia-ml-py to exclude list
* Align rocm base image with torch wheels @6.1. Seems like the most stable combo
2024-12-13 15:41:03 +01:00
Mohamed Mekkouri
f491096f7d
Fix docker CI : install autogptq from source (#35000)
...
* Fixed Docker
* Test ci
* Finally
* add comment
2024-11-28 16:31:36 +01:00
Mohamed Mekkouri
8f48ccf548
Fix : Add PEFT from source to CI docker (#34969)
...
* Docker fix peft
* Test new docker
* uncomment
2024-11-27 14:10:47 +01:00
Mohamed Mekkouri
b76a292bde
Upgrade torch version to 2.5 in dockerfile for quantization CI (#34924)
...
* Upgrade Torch 2.5
* uncomment
2024-11-25 17:38:20 +01:00
Benjamin Bossan
b13916c09d
[AWQ, CI] Bump AWQ version used in docker image (#34922)
...
The old AWQ version is failing with the latest (unreleased)
transformers, giving the error:
> ImportError: cannot import name 'shard_checkpoint' from
'transformers.modeling_utils'
This has been resolved in awq v0.2.7:
https://github.com/casper-hansen/AutoAWQ/pull/644
2024-11-25 16:49:57 +01:00
Yih-Dar
eab6c491d4
Use torch 2.5 in scheduled CI (#34465)
...
* torch 2.5
* try
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2024-10-30 14:54:10 +01:00
Yih-Dar
fc465bb196
pin tensorflow_probability<0.22 in docker files (#34381)
...
0.21
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2024-10-28 11:59:46 +01:00
Yih-Dar
f0e640adfa
Drop support for Python 3.8 (#34314)
...
* drop python 3.8
* update docker files
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2024-10-24 11:16:55 +02:00
Name
7f5088503f
removes decord (#33987)
...
* removes decord dependency
optimize
np
Revert "optimize"
This reverts commit faa136b51ec4ec5858e5b0ae40eb7ef89a88b475.
helpers as documentation
pydoc
missing keys
* make fixup
* require_av
---------
Co-authored-by: ad <hi@arnaudiaz.com >
2024-10-17 17:27:34 +02:00
Arthur
fa3f2db5c7
Add documentation for docker (#33156)
...
* initial commit
* nit
2024-10-14 11:58:45 +02:00
Marc Sun
cac4a4876b
[Quantization] Switch to optimum-quanto (#31732)
...
* switch to optimum-quanto rebase squash
* fix import check
* again
* test try-except
* style
2024-10-02 15:14:34 +02:00
Ita Zaporozhets
e48e5f1f13
Support reading tiktoken tokenizer.model file (#31656)
...
* use existing TikTokenConverter to read tiktoken tokenizer.model file
* del test file
* create tiktoken integration file
* adding tiktoken llama test
* ALTERNATIVE IMPLEMENTATION: supports llama 405B
* fix one char
* remove redundant line
* small fix
* rm unused import
* flag for converting from tiktoken
* remove unneeded file
* ruff
* remove llamatiktokenconverter, stick to general converter
* tiktoken support v2
* update test
* remove stale changes
* update doc
* protect import
* use is_protobuf_available
* add templateprocessor in tiktokenconverter
* reverting templateprocessor from tiktoken support
* update test
* add require_tiktoken
* dev-ci
* trigger build
* trigger build again
* dev-ci
* [build-ci-image] tiktoken
* dev-ci
* dev-ci
* dev-ci
* dev-ci
* change tiktoken file name
* feedback review
* feedback rev
* applying feedback, removing tiktoken converters
* conform test
* adding docs for review
* add doc file for review
* add doc file for review
* add doc file for review
* support loading model without config.json file
* Revert "support loading model without config.json file"
This reverts commit 2753602e51c34cef2f184eb11f36d2ad1b02babb.
* remove dev var
* updating docs
* safely import protobuf
* fix protobuf import error
* fix protobuf import error
* trying isort to fix ruff error
* fix ruff error
* try to fix ruff again
* try to fix ruff again
* try to fix ruff again
* doc table of contents
* add fix for consistency.dockerfile torchaudio
* ruff
* applying feedback
* minor typo
* merging with push-ci-image
* clean up imports
* revert dockerfile consistency
2024-09-06 14:24:02 +02:00
Sai-Suraj-27
3562772969
fix: Fixed pydantic required version in dockerfiles to make it compatible with DeepSpeed (#33105)
...
Fixed pydantic required version in dockerfiles.
2024-08-26 17:10:36 +02:00
Joao Gante
93e0e1a852
CI: add torchvision to the consistency image (#32941)
2024-08-26 15:17:45 +01:00
Arthur
3bb7b05229
Update docker image building (#32918)
...
commit
2024-08-21 21:23:10 +02:00
Ita Zaporozhets
54b7703682
support torch-speech (#32537)
2024-08-19 11:26:35 +02:00
Zach Mueller
0cea2081a3
Unpin deepspeed in Docker image/tests (#32572)
...
Unpin deepspeed
2024-08-14 18:30:25 +01:00
Joao Gante
36fd35e1cf
Dependencies: fix typo (#32389)
...
deps_2
2024-08-06 12:36:33 +01:00
Joao Gante
e3d8285a84
Docker: add speech dep to the consistency docker image (#32374)
2024-08-01 13:46:11 +01:00
Yih-Dar
f0bc49e7f6
use torch 2.4 in 2 CI jobs (#32302)
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2024-07-29 22:12:21 +02:00
Joao Gante
d1a1bcf56a
Docker: TF pin on the consistency job (#31928)
...
* pin
* dev-ci
* dev-ci
* dev-ci
* test pushed image
2024-07-12 14:28:46 +02:00
Yih-Dar
8e3b1fef97
Remove ninja from docker image build (#31080)
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2024-05-28 11:36:26 +02:00
Yih-Dar
d7942d9d27
unpin uv (#31055)
...
[push-ci-image]
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2024-05-27 13:47:47 +02:00
Younes Belkada
658b849aeb
Quantization / TST: Fix remaining quantization tests (#31000)
...
* Fix remaining quant tests
* Update test_quanto.py
2024-05-24 14:35:59 +02:00
Yih-Dar
5855afd1f3
pin uv==0.1.45 (#31006)
...
* fix
* [push-ci-image]
* run with latest
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2024-05-24 12:00:50 +02:00
Raushan Turganbay
d583f1317b
Quantized KV Cache (#30483)
...
* clean-up
* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* fixup
* Update tests/quantization/quanto_integration/test_quanto.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
* Update src/transformers/generation/configuration_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* more suggestions
* mapping if torch available
* run tests & add 'support_quantized' flag
* fix jamba test
* revert, will be fixed by another PR
* codestyle
* HQQ and versatile cache classes
* final update
* typo
* make tests happy
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
2024-05-23 17:25:20 +05:00
Arthur
8e8786e5f0
Update build ci image [push-ci-image] (#30933)
...
* [build-ci-image]
* correct branch
* push ci image
* [build-ci-image]
* update scheduled as well
* [push-ci-image]
* [build-ci-image]
* [push-ci-image]
* update deps
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* oups [build-ci-image]
* [push-ci-image]
* fix
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* updated
* [build-ci-image] update tag
* [build-ci-image]
* [build-ci-image]
* fix tag
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* github name
* commit_title?
* fetch
* update
* it not found
* dev
* dev
* [push-ci-image]
* dev
* dev
* update
* dev
* dev print dev commit message dev
* dev ? dev
* dev
* dev
* dev
* dev
* [build-ci-image]
* [build-ci-image]
* [push-ci-image]
* revert unwanted
* revert convert as well
* no you are not important
* [build-ci-image]
* Update .circleci/config.yml
* pin tf probability dev
2024-05-22 10:52:59 +02:00
Younes Belkada
fce78fd0e9
FIX / Quantization: Fix Dockerfile build (#30890)
...
* Update Dockerfile
* Update docker/transformers-quantization-latest-gpu/Dockerfile
2024-05-20 10:08:26 +02:00
Younes Belkada
4e17e7dcf8
TST / Quantization: Reverting to torch==2.2.1 (#30866)
...
Reverting to 2.2.1
2024-05-16 17:30:02 +02:00
Yih-Dar
2d83324ecf
Use torch 2.3 for CI (#30837)
...
2.3
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2024-05-15 19:31:52 +02:00
Lysandre Debut
a42844955f
Loading GGUF files support (#30391)
...
* Adds support for loading GGUF files
Co-authored-by: Younes Belkada <younesbelkada@gmail.com >
Co-authored-by: 99991 <99991@users.noreply.github.com >
* add q2_k q3_k q5_k support from @99991
* fix tests
* Update doc
* Style
* Docs
* fix CI
* Update docs/source/en/gguf.md
* Update docs/source/en/gguf.md
* Compute merges
* change logic
* add comment for clarity
* add comment for clarity
* Update src/transformers/models/auto/tokenization_auto.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* change logic
* Update src/transformers/modeling_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* change
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update src/transformers/modeling_gguf_pytorch_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* put back comment
* add comment about mistral
* comments and added tests
* fix inconsistent type
* more
* fix tokenizer
* Update src/transformers/modeling_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* address comments about tests and tokenizer + add added_tokens
* from_gguf -> gguf_file
* replace on docs too
---------
Co-authored-by: Younes Belkada <younesbelkada@gmail.com >
Co-authored-by: 99991 <99991@users.noreply.github.com >
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
2024-05-15 14:28:20 +02:00
fxmarty
37bba2a32d
CI: update to ROCm 6.0.2 and test MI300 (#30266)
...
* update to ROCm 6.0.2 and test MI300
* add callers for mi300
* update dockerfile
* fix trainer tests
* remove apex
* style
* Update tests/trainer/test_trainer_seq2seq.py
* Update tests/trainer/test_trainer_seq2seq.py
* Update tests/trainer/test_trainer_seq2seq.py
* Update tests/trainer/test_trainer_seq2seq.py
* update to torch 2.3
* add workflow dispatch target
* we may need branches: mi300-ci after all
* nit
* fix docker build
* nit
* add check runner
* remove docker-gpu
* fix issues
* fix
---------
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com >
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2024-05-13 18:14:36 +02:00
Zach Mueller
5b7a225f25
Pin deepspeed (#30701)
...
pin ds
2024-05-07 13:45:24 -04:00
Arthur
307f632bb2
[CI update] Try to use dockers and no cache (#29202)
...
* change cis
* nits
* update
* minor updates
* [push-ci-image]
* nit [push-ci-image]
* nitsssss
* [build-ci-image]
* [push-ci-image]
* [push-ci-image]
* both
* [push-ci-image]
* this?
* [push-ci-image]
* pypi-kenlm needs g++
* [push-ci-image]
* nit
* more nits [push-ci-image]
* nits [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* add vision
* [push-ci-image]
* [push-ci-image]
* add new dummy file but will need to update them [push-ci-image]
* [push-ci-image]
* show package size as well
* [push-ci-image]
* potentially ignore failures
* workflow updates
* nits [push-ci-image]
* [push-ci-image]
* fix consistency
* clean nvidia triton
* also show big packages [push-ci-image]
* nit
* update
* another one
* line escape?
* add accelerate [push-ci-image]
* updates [push-ci-image]
* nits to run tests, no push-ci
* try to parse skip reason to make sure nothing is skipped that should not be skipped
* nit?
* always show skipped reasons
* nits
* better parsing of the test outputs
* action="store_true",
* failure on failed
* show matched
* debug
* update short summary with skipped, failed and errors
* nits
* nits
* cool updates
* remove docbuilder
* fix
* always run checks
* oups
* nits
* don't error out on library printing
* non zero exi codes
* no warning
* nit
* WAT?
* format nit
* [push-ci-image]
* fail if fail is needed
* [push-ci-image]
* sound file for torch light?
* [push-ci-image]
* order is important [push-ci-image]
* [push-ci-image] reduce even further
* [push-ci-image]
* use pytest rich !
* yes [push-ci-image]
* oupsy
* bring back the full traceback, but pytest rich should help
* nit
* [push-ci-image]
* re run
* nit
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* empty push to trigger
* [push-ci-image]
* nit? [push-ci-image]
* empty
* try to install timm with no deps
* [push-ci-image]
* oups [push-ci-image]
* [push-ci-image]
* [push-ci-image] ?
* [push-ci-image] open ssh client for git checkout fast
* empty for torch light
* updates [push-ci-image]
* nit
* @v4 for checkout
* [push-ci-image]
* [push-ci-image]
* fix fetch tests with parallelism
* [push-ci-image]
* more parallelism
* nit
* more nits
* empty to re-trigger
* empty to re-trigger
* split by timing
* did not work with previous commit
* junit.xml
* no path?
* mmm this?
* junitxml format
* split by timing
* nit
* fix junit family
* now we can test if the xunit1 is compatible!
* this?
* fully list tests
* update
* update
* oups
* finally
* use classname
* remove working directory to make sure the path does not interfere
* okay no juni should have the correct path
* name split?
* sort by classname is what make most sense
* some testing
* name
* oups
* test something fun
* autodetect
* 18?
* nit
* file size?
* uip
* 4 is best
* update to see versions
* better print
* [push-ci-image]
* [push-ci-image]
* please install the correct keras version
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* uv is fucking me up
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* nits
* [push-ci-image]
* [push-ci-image]
* install issues and pins
* tapas as well
* nits
* more parallelism
* short tb
* soundfile
* soundfile
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* oups
* [push-ci-image]
* fix some things
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* use torch-light for hub
* small git lfs for hub job
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* fix tf tapas
* [push-ci-image]
* nits
* [push-ci-image]
* don't update the test
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* no use them
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* update tf proba
* [push-ci-image]
* [push-ci-image]
* woops
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* test with built dockers
* [push-ci-image]
* skip annoying tests
* revert fix copy
* update test values
* update
* last skip and fixup
* nit
* ALL GOOOD
* quality
* Update tests/models/layoutlmv2/test_image_processing_layoutlmv2.py
* Update docker/quality.dockerfile
Co-authored-by: Lysandre Debut <hi@lysand.re >
* Update src/transformers/models/tapas/modeling_tf_tapas.py
Co-authored-by: Lysandre Debut <hi@lysand.re >
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <hi@lysand.re >
* use torch-speed
* updates
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* fuck ken-lm [push-ci-image]
* [push-ci-image]
* [push-ci-image]
---------
Co-authored-by: Lysandre Debut <hi@lysand.re >
2024-05-06 10:10:32 +02:00
mobicham
59952994c4
Add HQQ quantization support (#29637)
...
* update HQQ transformers integration
* push import_utils.py
* add force_hooks check in modeling_utils.py
* fix | with Optional
* force bias as param
* check bias is Tensor
* force forward for multi-gpu
* review fixes pass
* remove torch grad()
* if any key in linear_tags fix
* add cpu/disk check
* isinstance return
* add multigpu test + refactor tests
* clean hqq_utils imports in hqq.py
* clean hqq_utils imports in quantizer_hqq.py
* delete hqq_utils.py
* Delete src/transformers/utils/hqq_utils.py
* ruff init
* remove torch.float16 from __init__ in test
* refactor test
* isinstance -> type in quantizer_hqq.py
* cpu/disk device_map check in quantizer_hqq.py
* remove type(module) nn.linear check in quantizer_hqq.py
* add BaseQuantizeConfig import inside HqqConfig init
* remove hqq import in hqq.py
* remove accelerate import from test_hqq.py
* quant config.py doc update
* add hqqconfig to main_classes doc
* make style
* __init__ fix
* ruff __init__
* skip_modules list
* hqqconfig format fix
* hqqconfig doc fix
* hqqconfig doc fix
* hqqconfig doc fix
* hqqconfig doc fix
* hqqconfig doc fix
* hqqconfig doc fix
* hqqconfig doc fix
* hqqconfig doc fix
* hqqconfig doc fix
* test_hqq.py remove mistral comment
* remove self.using_multi_gpu is False
* torch_dtype default val set and logger.info
* hqq.py isinstance fix
* remove torch=None
* torch_device test_hqq
* rename test_hqq
* MODEL_ID in test_hqq
* quantizer_hqq setattr fix
* quantizer_hqq typo fix
* imports quantizer_hqq.py
* isinstance quantizer_hqq
* hqq_layer.bias reformat quantizer_hqq
* Step 2 as comment in quantizer_hqq
* prepare_for_hqq_linear() comment
* keep_in_fp32_modules fix
* HqqHfQuantizer reformat
* quantization.md hqqconfig
* quantization.md model example reformat
* quantization.md # space
* quantization.md space })
* quantization.md space })
* quantization_config fix doc
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* axis value check in quantization_config
* format
* dynamic config explanation
* quant config method in quantization.md
* remove shard-level progress
* .cuda fix modeling_utils
* test_hqq fixes
* make fix-copies
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
2024-05-02 17:51:49 +01:00
Younes Belkada
d179b9dc78
FIX: re-add bnb on docker image (#30427)
...
Update Dockerfile
2024-04-23 15:32:54 +02:00
zhong zhuang
b4c18a830a
[FEAT]: EETQ quantizer support (#30262)
...
* [FEAT]: EETQ quantizer support
* Update quantization.md
* Update docs/source/en/main_classes/quantization.md
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update docs/source/en/quantization.md
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update docs/source/en/quantization.md
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update src/transformers/integrations/__init__.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update src/transformers/integrations/__init__.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update src/transformers/integrations/eetq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update src/transformers/integrations/eetq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update src/transformers/integrations/eetq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update tests/quantization/eetq_integration/test_eetq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update src/transformers/quantizers/auto.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update src/transformers/quantizers/auto.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update src/transformers/quantizers/auto.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update src/transformers/quantizers/quantizer_eetq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update tests/quantization/eetq_integration/test_eetq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update src/transformers/quantizers/quantizer_eetq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update tests/quantization/eetq_integration/test_eetq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update tests/quantization/eetq_integration/test_eetq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* [FEAT]: EETQ quantizer support
* [FEAT]: EETQ quantizer support
* remove whitespaces
* update quantization.md
* style
* Update docs/source/en/quantization.md
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
* add copyright
* Update quantization.md
* Update docs/source/en/quantization.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update docs/source/en/quantization.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Address the comments by amyeroberts
* style
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
Co-authored-by: Marc Sun <marc@huggingface.co >
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
2024-04-22 20:38:58 +01:00
Yih-Dar
cbc2cc187a
More fixes for doctest (#30265)
...
* fix
* update
* update
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2024-04-16 11:58:55 +02:00
Yih-Dar
4f7a9f9c5c
Fix natten install in docker (#30161)
...
* fix dinat in docker
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2024-04-10 17:45:49 +02:00
Marc Sun
58a939c6b7
Fix quantization tests (#29914)
...
* revert back to torch 2.1.1
* run test
* switch to torch 2.2.1
* update dockerfile
* fix awq tests
* fix test
* run quanto tests
* update tests
* split quantization tests
* fix
* fix again
* final fix
* fix report artifact
* build docker again
* Revert "build docker again"
This reverts commit 399a5f9d9308da071d79034f238c719de0f3532e.
* debug
* revert
* style
* new notification system
* testing notification
* rebuild docker
* fix_prev_ci_results
* typo
* remove warning
* fix typo
* fix artifact name
* debug
* issue fixed
* debug again
* fix
* fix time
* test notif with failing test
* typo
* issues again
* final fix ?
* run all quantization tests again
* remove name to clear space
* revert modification done on workflow
* fix
* build docker
* build only quant docker
* fix quantization ci
* fix
* fix report
* better quantization_matrix
* add print
* revert to the basic one
2024-04-09 17:10:29 +02:00
Ilyas Moutawwakil
07d79520ef
Disable AMD memory benchmarks (#29871)
...
* remove py3nvml to skip amd memory benchmarks
* uninstall pynvml from docker images
2024-03-26 14:43:12 +01:00
Yih-Dar
2ddceef9a2
Fix docker image build for Latest PyTorch + TensorFlow [dev] (#29764)
...
* update
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2024-03-21 13:14:29 +01:00
Marc Sun
28de2f4de3
[Quantization] Quanto quantizer (#29023)
...
* start integration
* fix
* add and debug tests
* update tests
* make pytorch serialization work
* compatible with device_map and offload
* fix tests
* make style
* add ref
* guard against safetensors
* add float8 and style
* fix is_serializable
* Fix shard_checkpoint compatibility with quanto
* more tests
* docs
* adjust memory
* better
* style
* pass tests
* Update src/transformers/modeling_utils.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
* add is_safe_serialization instead
* Update src/transformers/quantizers/quantizer_quanto.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
* add QbitsTensor tests
* fix tests
* simplify activation list
* Update docs/source/en/quantization.md
Co-authored-by: David Corvoysier <david.corvoysier@gmail.com >
* better comment
* Update tests/quantization/quanto_integration/test_quanto.py
Co-authored-by: David Corvoysier <david.corvoysier@gmail.com >
* Update tests/quantization/quanto_integration/test_quanto.py
Co-authored-by: David Corvoysier <david.corvoysier@gmail.com >
* find and fix edge case
* Update docs/source/en/quantization.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* pass weights_only_kwarg instead
* fix shard_checkpoint loading
* simplify update_missing_keys
* Update tests/quantization/quanto_integration/test_quanto.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* recursion to get all tensors
* block serialization
* skip serialization tests
* fix
* change by cuda:0 for now
* fix regression
* update device_map
* fix doc
* add notebook
* update torch_dtype
* update doc
* typo
* typo
* remove comm
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
Co-authored-by: David Corvoysier <david.corvoysier@gmail.com >
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
Co-authored-by: Younes Belkada <younesbelkada@gmail.com >
2024-03-15 11:51:29 -04:00
Ilyas Moutawwakil
4fc708f98c
Exllama kernels support for AWQ models (#28634)
...
* added exllama kernels support for awq models
* doc
* style
* Update src/transformers/modeling_utils.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* refactor
* moved exllama post init to after device dispatching
* bump autoawq version
* added exllama test
* style
* configurable exllama kernels
* copy exllama_config from gptq
* moved exllama version check to post init
* moved to quantization dockerfile
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
2024-03-05 03:22:48 +01:00
Marc Sun
f54d82cace
[CI] Quantization workflow (#29046)
...
* [CI] Quantization workflow
* build dockerfile
* fix dockerfile
* update self-scheduled.yml
* test build dockerfile on push
* fix torch install
* update to python 3.10
* update aqlm version
* uncomment build dockerfile
* tests if the scheduler works
* fix docker
* do not trigger on push again
* add additional runs
* test again
* all good
* style
* Update .github/workflows/self-scheduled.yml
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
* test build dockerfile with torch 2.2.0
* fix extra
* clean
* revert changes
* Revert "revert changes"
This reverts commit 4cb52b8822da9d1786a821a33e867e4fcc00d8fd.
* revert correct change
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
2024-02-28 10:09:25 -05:00
Yih-Dar
5c341d4555
Use torch 2.2 for deepspeed CI (#29246)
...
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2024-02-27 17:51:37 +08:00
Yih-Dar
c8d98405a8
Use torch 2.2 for daily CI (model tests) (#29208)
...
* Use torch 2.2 for daily CI (model tests)
* update
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2024-02-23 21:37:08 +08:00
Andrei Panferov
1ecf5f7c98
AQLM quantizer support (#28928)
...
* aqlm init
* calibration and dtypes
* docs
* Readme update
* is_aqlm_available
* Simpler link in docs
* Test TODO real reference
* init _import_structure fix
* AqlmConfig autodoc
* integration aqlm
* integrations in tests
* docstring fix
* legacy typing
* Less typings
* More kernels information
* Performance -> Accuracy
* correct tests
* removed multi-gpu test
* Update docs/source/en/quantization.md
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
* Update src/transformers/utils/quantization_config.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Brought back multi-gpu tests
* Update src/transformers/integrations/aqlm.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update tests/quantization/aqlm_integration/test_aqlm.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
---------
Co-authored-by: Andrei Panferov <blacksamorez@yandex-team.ru >
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
2024-02-14 09:25:41 +01:00