Add new quant method (#32047)

* Add new quant method

* update

* fix multi-device

* add test

* add offload

* style

* style

* add simple example

* initial doc

* docstring

* style again

* works ?

* better docs

* switch to non-persistent

* remove print

* fix init

* code review
Marc Sun
2024-07-22 20:21:59 +02:00
committed by GitHub
parent bd9dca3b85
commit 96a074fa7e
16 changed files with 770 additions and 5 deletions

@@ -157,6 +157,8 @@
     title: EETQ
   - local: quantization/hqq
     title: HQQ
+  - local: quantization/fbgemm_fp8
+    title: FBGEMM_FP8
   - local: quantization/optimum
     title: Optimum
   - local: quantization/contribute