HuggingFace_transformer

Vladislav Bronzov 9d200cfbee Add gguf support for bloom (#33473)

* add bloom arch support for gguf

* apply format

* small refactoring, bug fix in GGUF_TENSOR_MAPPING naming

* optimize bloom GGUF_TENSOR_MAPPING

* implement reverse reshaping for bloom gguf

* add qkv weights test

* add q_8 test for bloom

2024-09-27 12:13:40 +02:00

aqlm_integration

Cache: use batch_size instead of max_batch_size (#32657)

2024-08-16 11:48:45 +01:00

autoawq

Skip tests properly (#31308)

2024-06-26 21:59:08 +01:00

bnb

Enable BNB multi-backend support (#31098)

2024-09-24 03:40:56 -06:00

compressed_tensor

HFQuantizer implementation for compressed-tensors library (#31704)

2024-09-25 14:31:38 +02:00

eetq_integration

[FEAT]: EETQ quantizer support (#30262)

2024-04-22 20:38:58 +01:00

fbgemm_fp8

Fix FbgemmFp8Linear not preserving tensor shape (#33239)

2024-09-11 13:26:44 +02:00

ggml

Add gguf support for bloom (#33473)

2024-09-27 12:13:40 +02:00

gptq

🚨 Remove dataset with restrictive license (#31452)

2024-06-17 17:56:51 +01:00

hqq

Quantization / HQQ: Fix HQQ tests on our runner (#30668)

2024-05-06 11:33:52 +02:00

quanto_integration

Skip tests properly (#31308)

2024-06-26 21:59:08 +01:00

torchao_integration

Add TorchAOHfQuantizer (#32306)

2024-08-14 16:14:24 +02:00