mobicham
59952994c4
Add HQQ quantization support (#29637)
* update HQQ transformers integration
* push import_utils.py
* add force_hooks check in modeling_utils.py
* fix | with Optional
* force bias as param
* check bias is Tensor
* force forward for multi-gpu
* review fixes pass
* remove torch grad()
* if any key in linear_tags fix
* add cpu/disk check
* isinstance return
* add multigpu test + refactor tests
* clean hqq_utils imports in hqq.py
* clean hqq_utils imports in quantizer_hqq.py
* delete hqq_utils.py
* Delete src/transformers/utils/hqq_utils.py
* ruff init
* remove torch.float16 from __init__ in test
* refactor test
* isinstance -> type in quantizer_hqq.py
* cpu/disk device_map check in quantizer_hqq.py
* remove type(module) nn.linear check in quantizer_hqq.py
* add BaseQuantizeConfig import inside HqqConfig init
* remove hqq import in hqq.py
* remove accelerate import from test_hqq.py
* quant config.py doc update
* add hqqconfig to main_classes doc
* make style
* __init__ fix
* ruff __init__
* skip_modules list
* hqqconfig format fix
* hqqconfig doc fix
* test_hqq.py remove mistral comment
* remove self.using_multi_gpu is False
* torch_dtype default val set and logger.info
* hqq.py isinstance fix
* remove torch=None
* torch_device test_hqq
* rename test_hqq
* MODEL_ID in test_hqq
* quantizer_hqq setattr fix
* quantizer_hqq typo fix
* imports quantizer_hqq.py
* isinstance quantizer_hqq
* hqq_layer.bias reformat quantizer_hqq
* Step 2 as comment in quantizer_hqq
* prepare_for_hqq_linear() comment
* keep_in_fp32_modules fix
* HqqHfQuantizer reformat
* quantization.md hqqconfig
* quantization.md model example reformat
* quantization.md # space
* quantization.md space })
* quantization.md space })
* quantization_config fix doc
* axis value check in quantization_config
* format
* dynamic config explanation
* quant config method in quantization.md
* remove shard-level progress
* .cuda fix modeling_utils
* test_hqq fixes
* make fix-copies
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-02 17:51:49 +01:00