Younes Belkada
|
7b139023c3
|
[AWQ ] Addresses TODO for awq tests (#27467)
addresses todo for awq tests
|
2023-11-13 18:18:41 +01:00 |
|
Younes Belkada
|
fd685cfd59
|
[Quantization] Add str to enum conversion for AWQ (#27320)
* add str to enum conversion
* fixup
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
|
2023-11-10 13:45:00 +01:00 |
|
Younes Belkada
|
9b25c164bd
|
[core / Quantization] Fix for 8bit serialization tests (#27234)
* fix for 8bit serialization
* added regression tests.
* fixup
|
2023-11-02 12:03:51 +01:00 |
|
Marc Sun
|
c9e72f55b2
|
Add exllamav2 better (#27111)
* add exllamav2 arg
* add test
* style
* add check
* add doc
* replace by use_exllama_v2
* fix tests
* fix doc
* style
* better condition
* fix logic
* add deprecate msg
* deprecate exllama
* remove disable_exllama from the linter
* remove
* fix warning
* Revert the commits deprecating exllama
* deprecate disable_exllama for use_exllama
* fix
* fix loading attribute
* better handling of args
* remove disable_exllama from init and linter
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* better arg
* fix warning
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* switch to dict
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* style
* nits
* style
* better tests
* style
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
|
2023-11-01 13:09:21 -04:00 |
|
Younes Belkada
|
ae093eef01
|
[core / Quantization ] AWQ integration (#27045)
* working v1
* oops
* Update src/transformers/modeling_utils.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* fixup
* oops
* push
* more changes
* add docs
* some fixes
* fix copies
* add v1 doc
* added installation guide
* relax constraints
* revert
* attempt llm-awq
* oops
* oops
* fixup
* raise error when incorrect cuda compute capability
* nit
* add instructions for llm-awq
* fixup
* fix copies
* fixup and docs
* change
* few changes + add demo
* add v1 tests
* add autoawq in dockerfile
* finalize
* Update tests/quantization/autoawq/test_awq.py
* fix test
* fix
* fix issue
* Update src/transformers/integrations/awq.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/en/main_classes/quantization.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/en/main_classes/quantization.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/integrations/awq.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/integrations/awq.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* add link to example script
* Update docs/source/en/main_classes/quantization.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* add more content
* add more details
* add link to quantization docs
* camel case + change backend class name
* change to string
* fixup
* raise errors if libs not installed
* change to `bits` and `group_size`
* nit
* nit
* Apply suggestions from code review
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* disable training
* address some comments and fix nits
* fix
* final nits and fix tests
* adapt to our new runners
* make fix-copies
* Update src/transformers/utils/quantization_config.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/utils/quantization_config.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/integrations/awq.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/integrations/awq.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* move to top
* add conversion test
* final nit
* add more elaborated test
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
|
2023-11-01 09:06:31 +01:00 |
|
Younes Belkada
|
4bb50aa212
|
[Quantization / tests ] Fix bnb MPT test (#27178)
fix bnb mpt test
|
2023-10-31 16:25:53 +01:00 |
|
Younes Belkada
|
6b466771b0
|
[tests / Quantization] Fix bnb test (#27145)
* fix bnb test
* link to GH issue
|
2023-10-30 15:43:08 +01:00 |
|
Arthur
|
90ee9cea19
|
Revert "add exllamav2 arg" (#27102)
Revert "add exllamav2 arg (#26437)"
This reverts commit 8214d6e7b1.
|
2023-10-27 11:23:06 +02:00 |
|
Marc Sun
|
8214d6e7b1
|
add exllamav2 arg (#26437)
* add exllamav2 arg
* add test
* style
* add check
* add doc
* replace by use_exllama_v2
* fix tests
* fix doc
* style
* better condition
* fix logic
* add deprecate msg
|
2023-10-26 10:15:05 -04:00 |
|
Younes Belkada
|
fd6a0ade9b
|
🚨🚨🚨 [Quantization] Store the original dtype in the config as a private attribute 🚨🚨🚨 (#26761)
* First step
* fix
* add adjustments for gptq
* change to `_pre_quantization_dtype`
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix serialization
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fixup
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
|
2023-10-16 19:56:53 +02:00 |
|
Heinz-Alexander Fuetterer
|
883ed4b344
|
chore: fix typos (#26756)
|
2023-10-12 18:00:27 +02:00 |
|
Younes Belkada
|
2aef9a9601
|
[PEFT] Final fixes (#26559)
* fix issues with PEFT
* logger warning futurewarning issues
* fixup
* adapt from suggestions
* oops
* rm test
|
2023-10-03 14:53:09 +02:00 |
|
Younes Belkada
|
6824461f2a
|
[core/ auto ] Fix bnb test with code revision + bug with code revision (#26431)
* fix bnb test with code revision
* fix test
* Apply suggestions from code review
* Update src/transformers/models/auto/auto_factory.py
* Update src/transformers/models/auto/auto_factory.py
* Update src/transformers/models/auto/auto_factory.py
|
2023-10-02 11:35:07 +02:00 |
|
Younes Belkada
|
7ccac73f74
|
[RWKV] Final fix RWKV 4bit (#26134)
* Final fix RWKV 4bit
* fixup
* add a test
* add more clarifications
|
2023-09-13 16:30:20 +02:00 |
|
Younes Belkada
|
c8b26096d4
|
[core] fix 4bit num_parameters (#26132)
* fix 4bit `num_parameters`
* stronger check
|
2023-09-13 14:12:35 +02:00 |
|
Marc Sun
|
fa6107c97e
|
modify context length for GPTQ + version bump (#25899)
* add new arg for gptq
* add tests
* add min version autogptq
* fix order
* skip test
* fix
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix style
* change model path
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
|
2023-09-06 11:45:47 -04:00 |
|
Younes Belkada
|
4b79697865
|
🚨🚨🚨 [Refactor] Move third-party related utility files into integrations/ folder 🚨🚨🚨 (#25599)
* move deepspeed to `lib_integrations.deepspeed`
* more refactor
* oops
* fix slow tests
* Fix docs
* fix docs
* address feedback
* address feedback
* final modifs for PEFT
* fixup
* ok now
* trigger CI
* trigger CI again
* Update docs/source/en/main_classes/deepspeed.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* import from `integrations`
* address feedback
* revert removal of `deepspeed` module
* revert removal of `deepspeed` module
* fix conflicts
* ooops
* oops
* add deprecation warning
* place it on the top
* put `FutureWarning`
* fix conflicts with not_doctested.txt
* add back `bitsandbytes` module with a depr warning
* fix
* fix
* fixup
* oops
* fix doctests
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
|
2023-08-25 17:13:34 +02:00 |
|
Younes Belkada
|
584eeb5387
|
[AutoGPTQ] Add correct installation of GPTQ library + fix slow tests (#25713)
* add correct installation of GPTQ library
* update tests values
|
2023-08-24 14:57:16 +02:00 |
|
Younes Belkada
|
e7e9261a20
|
[Docs] Fix un-rendered images (#25561)
fix un-rendered images
|
2023-08-17 12:08:11 +02:00 |
|
Marc Sun
|
55db70c63d
|
GPTQ integration (#25062)
* GPTQ integration
* Add tests for gptq
* support for more quantization model
* fix style
* typo
* fix method
* Update src/transformers/modeling_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add dataclass and fix quantization_method
* fix doc
* Update tests/quantization/gptq/test_gptq.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* modify dataclass
* add gptqconfig import
* fix typo
* fix tests
* remove dataset as req arg
* remove tokenizer import
* add offload cpu quantization test
* fix check dataset
* modify dockerfile
* protect trainer
* style
* test for config
* add more log
* overwrite torch_dtype
* draft doc
* modify quantization_config docstring
* fix class name in docstring
* Apply suggestions from code review
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* more warning
* fix 8bit kwargs tests
* peft compatibility
* remove var
* fix is_gptq_quantized
* remove is_gptq_quantized
* fix wrap
* Update src/transformers/modeling_utils.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* add exllama
* skip test
* overwrite float16
* style
* fix skip test
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix docstring formatting
* add doc
* better test
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
|
2023-08-10 16:06:29 -04:00 |
|