Adding FP8 Quantization to transformers (#36026)
* first commit * adding kernels * fix create_quantized_param * fix quantization logic * end2end * fix style * fix imports * fix consistency * update * fix style * update * udpate after review * make style * update * update * fix * update * fix docstring * update * update after review * update * fix scheme * update * update * fix * update * fix docstring * add source * fix test --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
This commit is contained in:
@@ -80,3 +80,7 @@ Learn how to quantize models in the [Quantization](../quantization) guide.
|
||||
## BitNetConfig
|
||||
|
||||
[[autodoc]] BitNetConfig
|
||||
|
||||
## FineGrainedFP8Config
|
||||
|
||||
[[autodoc]] FineGrainedFP8Config
|
||||
|
||||
Reference in New Issue
Block a user