Add TorchAOHfQuantizer (#32306)

* Add TorchAOHfQuantizer

  Summary:
  Enable loading torchao quantized model in huggingface.

  Test Plan:
  local test

  Reviewers:

  Subscribers:

  Tasks:

  Tags:

* Fix a few issues
* style
* Added tests and addressed some comments about dtype conversion
* fix torch_dtype warning message
* fix tests
* style
* TorchAOConfig -> TorchAoConfig
* enable offload + fix memory with multi-gpu
* update torchao version requirement to 0.4.0
* better comments
* add torch.compile to torchao README, add perf number link

---------

Co-authored-by: Marc Sun <marc@huggingface.co>
2024-08-14 07:14:24 -07:00
parent 9485289f37
commit 78d78cdf8a
13 changed files with 539 additions and 3 deletions
--- a/docs/source/en/_toctree.yml
+++ b/docs/source/en/_toctree.yml
@@ -163,6 +163,8 @@
    title: FBGEMM_FP8
  - local: quantization/optimum
    title: Optimum
+  - local: quantization/torchao
+    title: TorchAO
  - local: quantization/contribute
    title: Contribute new quantization method
  title: Quantization Methods