Marc Sun
|
06a1d75bd5
|
fix gptq nits (#25500)
* fix nits
* fix docstring
* fix doc
* fix damp_percent
* fix doc
|
2023-08-14 11:43:38 -04:00 |
|
Marc Sun
|
55db70c63d
|
GPTQ integration (#25062)
* GTPQ integration
* Add tests for gptq
* support for more quantization model
* fix style
* typo
* fix method
* Update src/transformers/modeling_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add dataclass and fix quantization_method
* fix doc
* Update tests/quantization/gptq/test_gptq.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* modify dataclass
* add gtpqconfig import
* fix typo
* fix tests
* remove dataset as req arg
* remove tokenizer import
* add offload cpu quantization test
* fix check dataset
* modify dockerfile
* protect trainer
* style
* test for config
* add more log
* overwrite torch_dtype
* draft doc
* modify quantization_config docstring
* fix class name in docstring
* Apply suggestions from code review
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* more warning
* fix 8bit kwargs tests
* peft compatibility
* remove var
* fix is_gptq_quantized
* remove is_gptq_quantized
* fix wrap
* Update src/transformers/modeling_utils.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* add exllama
* skip test
* overwrite float16
* style
* fix skip test
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix docsting formatting
* add doc
* better test
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
|
2023-08-10 16:06:29 -04:00 |
|
Younes Belkada
|
972fdcc778
|
[Docs/quantization] Clearer explanation on how things works under the hood. + remove outdated info (#25216)
* clearer explanation on how things works under the hood.
* Update docs/source/en/main_classes/quantization.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/main_classes/quantization.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add `load_in_4bit` in `from_pretrained`
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
|
2023-08-01 10:56:52 +02:00 |
|
Stas Bekman
|
5220606607
|
[quantization.md] fix (#25190)
Update quantization.md
|
2023-07-31 09:37:29 -07:00 |
|
Younes Belkada
|
ca974aff0f
|
[Docs] Clarify 4bit docs (#24878)
* clarify 4bit docs
* Apply suggestions from code review
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
---------
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
|
2023-07-18 13:39:08 +02:00 |
|
Marc Sun
|
35eac0df75
|
add link to accelerate doc (#24601)
|
2023-07-10 17:49:30 -04:00 |
|
Sylvain Gugger
|
eb849f6604
|
Migrate doc files to Markdown. (#24376)
* Rename index.mdx to index.md
* With saved modifs
* Address review comment
* Treat all files
* .mdx -> .md
* Remove special char
* Update utils/tests_fetcher.py
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
---------
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
|
2023-06-20 18:07:47 -04:00 |
|