Support loading Quark quantized models in Transformers (#36372)

* add quark quantizer

* add quark doc

* clean up doc

* fix tests

* make style

* more style fixes

* cleanup imports

* cleaning

* precise install

* Update docs/source/en/quantization/quark.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update tests/quantization/quark_integration/test_quark.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/utils/quantization_config.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* remove import guard as suggested

* update copyright headers

* add quark to transformers-quantization-latest-gpu Dockerfile

* make tests pass on transformers main + quark==0.7

* add missing F8_E4M3 and F8_E5M2 keys from str_to_torch_dtype

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Bowen Bao <bowenbao@amd.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

fxmarty-amd, 2025-03-20 15:40:51 +01:00, committed by GitHub

parent ce091b1bda

commit 1a374799ce

15 changed files with 432 additions and 1 deletions

									

docs/source/en/main_classes/quantization.md
									
@@ -88,3 +88,7 @@ Learn how to quantize models in the [Quantization](../quantization) guide.
 ## FineGrainedFP8Config
 
 [[autodoc]] FineGrainedFP8Config
+
+## QuarkConfig
+
+[[autodoc]] QuarkConfig
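
For context, the loading path this commit enables can be sketched roughly as follows. This is a minimal illustration, not the commit's own code: it assumes the checkpoint's `config.json` already carries a Quark `quantization_config` (so no explicit `QuarkConfig` needs to be passed), that the Quark package is installed (assumed here to be `pip install amd-quark`), and the helper name `load_quark_model` and the model ID in the comment are hypothetical placeholders.

```python
def load_quark_model(model_id: str):
    """Load a Quark-quantized checkpoint via the standard Transformers API.

    Assumption: the checkpoint stores its Quark quantization settings in
    config.json, so from_pretrained can pick the Quark quantizer on its own.
    """
    # Imported lazily so this sketch can be imported without transformers
    # installed; the import only happens when the helper is called.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    return tokenizer, model


# Hypothetical usage (the model ID below is a placeholder, not a real repo):
# tokenizer, model = load_quark_model("your-org/your-quark-quantized-model")
# inputs = tokenizer("Hello", return_tensors="pt")
# output_ids = model.generate(**inputs, max_new_tokens=8)
```

Keeping the call identical to any other `from_pretrained` load is the point of the integration: the quantization method is read from the checkpoint rather than chosen by the caller.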
