HuggingFace_transformer

Vladislav Bronzov 5d11de4a2f Add Qwen2Moe GGUF loading support (#33264)

* update gguf doc, config and tensor mapping

* add qwen2moe architecture support, GGUFQwen2MoeConverter and q4 unit tests

* apply code style fixes

* reformat files

* assign GGUFQwen2Converter to qwen2_moe

2024-09-05 17:42:03 +02:00

aqlm_integration

Cache: use batch_size instead of max_batch_size (#32657)

2024-08-16 11:48:45 +01:00

autoawq

Skip tests properly (#31308)

2024-06-26 21:59:08 +01:00

bnb

remove to restriction for 4-bit model (#33122)

2024-09-02 16:28:50 +02:00

eetq_integration

[FEAT]: EETQ quantizer support (#30262)

2024-04-22 20:38:58 +01:00

fbgemm_fp8

Add new quant method (#32047)

2024-07-22 20:21:59 +02:00

ggml

Add Qwen2Moe GGUF loading support (#33264)

2024-09-05 17:42:03 +02:00

gptq

🚨 Remove dataset with restrictive license (#31452)

2024-06-17 17:56:51 +01:00

hqq

Quantization / HQQ: Fix HQQ tests on our runner (#30668)

2024-05-06 11:33:52 +02:00

quanto_integration

Skip tests properly (#31308)

2024-06-26 21:59:08 +01:00

torchao_integration

Add TorchAOHfQuantizer (#32306)

2024-08-14 16:14:24 +02:00