Andrei Panferov
64c05eecd6
HIGGS Quantization Support (#34997)
* higgs init
* working with crunches
* per-model workspaces
* style
* style 2
* tests and style
* higgs tests passing
* protecting torch import
* removed torch.Tensor type annotations
* torch.nn.Module inheritance fix maybe
* hide inputs inside quantizer calls
* style structure something
* Update src/transformers/quantizers/quantizer_higgs.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* reworked num_sms
* Update src/transformers/integrations/higgs.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* revamped device checks
* docstring upd
* Update src/transformers/quantizers/quantizer_higgs.py
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
* edited tests and device map assertions
* minor edits
* updated flute cuda version in docker
* Added p=1 and 2,3bit HIGGS
* flute version check update
* incorporated `modules_to_not_convert`
* less hardcoding
* Fixed comment
* Added docs
* Fixed gemma support
* example in docs
* fixed torch_dtype for HIGGS
* Update docs/source/en/quantization/higgs.md
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Collection link
* dequantize interface
* newer flute version, torch.compile support
* unittest message fix
* docs update compile
* isort
* ValueError instead of assert
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2024-12-23 16:54:49 +01:00