AQLM quantizer support (#28928)

* aqlm init

* calibration and dtypes

* docs

* Readme update

* is_aqlm_available

* Simpler link in docs

* Test TODO real reference

* init _import_structure fix

* AqlmConfig autodoc

* integration aqlm

* integrations in tests

* docstring fix

* legacy typing

* Less typings

* More kernels information

* Performance -> Accuracy

* correct tests

* remoced multi-gpu test

* Update docs/source/en/quantization.md

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/utils/quantization_config.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Brought back multi-gpu tests

* Update src/transformers/integrations/aqlm.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update tests/quantization/aqlm_integration/test_aqlm.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

---------

Co-authored-by: Andrei Panferov <blacksamorez@yandex-team.ru>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
This commit is contained in:
Andrei Panferov
2024-02-14 11:25:41 +03:00
committed by GitHub
parent 63ffd56d02
commit 1ecf5f7c98
14 changed files with 489 additions and 2 deletions

View File

@@ -26,6 +26,10 @@ Learn how to quantize models in the [Quantization](../quantization) guide.
</Tip>
## AqlmConfig
[[autodoc]] AqlmConfig
## AwqConfig
[[autodoc]] AwqConfig