HIGGS Quantization Support (#34997)

* higgs init

* working with crunches

* per-model workspaces

* style

* style 2

* tests and style

* higgs tests passing

* protecting torch import

* removed torch.Tensor type annotations

* torch.nn.Module inheritance fix maybe

* hide inputs inside quantizer calls

* style structure something

* Update src/transformers/quantizers/quantizer_higgs.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* reworked num_sms

* Update src/transformers/integrations/higgs.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* revamped device checks

* docstring upd

* Update src/transformers/quantizers/quantizer_higgs.py

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* edited tests and device map assertions

* minor edits

* updated flute cuda version in docker

* Added p=1 and 2,3bit HIGGS

* flute version check update

* incorporated `modules_to_not_convert`

* less hardcoding

* Fixed comment

* Added docs

* Fixed gemma support

* example in docs

* fixed torch_dtype for HIGGS

* Update docs/source/en/quantization/higgs.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Collection link

* dequantize interface

* newer flute version, torch.compile support

* unittest message fix

* docs update compile

* isort

* ValueError instead of assert

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

Andrei Panferov, 2024-12-23 22:54:49 +07:00, committed by GitHub

parent ef1f54a0a7

commit 64c05eecd6

16 changed files with 1249 additions and 0 deletions

									
docs/source/en/main_classes/quantization.md (4 changed lines)

@@ -57,6 +57,10 @@ Learn how to quantize models in the [Quantization](../quantization) guide.

 [[autodoc]] quantizers.base.HfQuantizer

+## HiggsConfig
+
+[[autodoc]] HiggsConfig
+
 ## HqqConfig

 [[autodoc]] HqqConfig
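
For orientation, here is a minimal sketch of how the `HiggsConfig` documented by this hunk might be passed to `from_pretrained`. It is not code from the commit. The helper name `higgs_load_kwargs` and the `ALLOWED_BITS` tuple are hypothetical, and the permitted bit widths are an assumption inferred from the "Added p=1 and 2,3bit HIGGS" note plus a 4-bit default. The explicit `raise ValueError` follows the "ValueError instead of assert" note, and the deferred import follows the "protecting torch import" note.

```python
# Assumption: bit widths inferred from the commit notes, not read from the library source.
ALLOWED_BITS = (2, 3, 4)


def higgs_load_kwargs(bits: int = 4) -> dict:
    """Build keyword arguments for `from_pretrained` with HIGGS quantization.

    Hypothetical helper for illustration only.
    """
    # Validate with an explicit exception rather than `assert`,
    # so the check still runs under `python -O`.
    if bits not in ALLOWED_BITS:
        raise ValueError(f"bits must be one of {ALLOWED_BITS}, got {bits}")

    # Deferred import: the module stays importable without transformers/torch.
    from transformers import HiggsConfig

    return {
        "quantization_config": HiggsConfig(bits=bits),
        "device_map": "auto",
    }


# Usage sketch (assumed to need a CUDA GPU and the FLUTE kernels; not run here):
#   from transformers import AutoModelForCausalLM
#   model = AutoModelForCausalLM.from_pretrained("<model-id>", **higgs_load_kwargs(4))
```

The model identifier is left as a placeholder on purpose. Supported models and hardware are covered in docs/source/en/quantization/higgs.md, which this commit adds.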
