Add TorchAOHfQuantizer (#32306)

* Add TorchAOHfQuantizer

Summary:
Enable loading torchao-quantized models in Hugging Face Transformers.

Test Plan:
local test

Reviewers:

Subscribers:

Tasks:

Tags:

* Fix a few issues

* style

* Added tests and addressed some comments about dtype conversion

* fix torch_dtype warning message

* fix tests

* style

* TorchAOConfig -> TorchAoConfig

* enable offload + fix memory with multi-gpu

* update torchao version requirement to 0.4.0

* better comments

* add torch.compile to torchao README, add perf number link

---------

Co-authored-by: Marc Sun <marc@huggingface.co>
Jerry Zhang
2024-08-14 07:14:24 -07:00
committed by GitHub
parent 9485289f37
commit 78d78cdf8a
13 changed files with 539 additions and 3 deletions

@@ -61,3 +61,7 @@ Learn how to quantize models in the [Quantization](../quantization) guide.
[[autodoc]] FbgemmFp8Config
## TorchAoConfig
[[autodoc]] TorchAoConfig