HuggingFace_transformer/docs/source/en at 78d78cdf8ae0351554eaae4f528c532e3274cf50

SUMIN/HuggingFace_transformer

Jerry Zhang 78d78cdf8a Add TorchAOHfQuantizer (#32306)

* Add TorchAOHfQuantizer

Summary:
Enable loading torchao quantized model in huggingface.

Test Plan:
local test

* Fix a few issues

* style

* Added tests and addressed some comments about dtype conversion

* fix torch_dtype warning message

* fix tests

* style

* TorchAOConfig -> TorchAoConfig

* enable offload + fix memory with multi-gpu

* update torchao version requirement to 0.4.0

* better comments

* add torch.compile to torchao README, add perf number link

---------

Co-authored-by: Marc Sun <marc@huggingface.co>

2024-08-14 16:14:24 +02:00

Gemma2: add cache warning (#32279)

2024-08-07 10:03:05 +05:00

Add TorchAOHfQuantizer (#32306)

2024-08-14 16:14:24 +02:00

"to be not" -> "not to be" (#32636)

2024-08-12 20:20:17 +01:00

Add TorchAOHfQuantizer (#32306)

2024-08-14 16:14:24 +02:00

[docs] Translation guide (#32547)

2024-08-08 13:43:14 -07:00

_config.py

[#29174] ImportError Fix: Trainer with PyTorch requires accelerate>=0.20.1 Fix (#29888)

2024-04-08 14:21:16 +01:00

_redirects.yml

Docs / Quantization: Redirect deleted page (#31063)

2024-05-28 18:29:22 +02:00

_toctree.yml

Add TorchAOHfQuantizer (#32306)

2024-08-14 16:14:24 +02:00

accelerate.md

…

add_new_model.md

Remove add-new-model in favor of add-new-model-like (#30424)

2024-04-24 09:38:18 +02:00

add_new_pipeline.md

add push_to_hub to pipeline (#29172)

2024-04-16 15:34:04 +01:00

agents.md

Agents use grammar (#31735)

2024-08-07 11:42:52 +02:00

attention.md

[Docs] Fix broken links and syntax issues (#28918)

2024-02-08 14:13:35 -08:00

autoclass_tutorial.md

Update all references to canonical models (#29001)

2024-02-16 08:16:58 +01:00

benchmarks.md

Update all references to canonical models (#29001)

2024-02-16 08:16:58 +01:00

bertology.md

…

big_models.md

[docs] Big model loading (#29920)

2024-04-01 18:47:32 -07:00

chat_templating.md

Cleanup tool calling documentation and rename doc (#32337)

2024-08-12 16:20:14 +01:00

community.md

Update all references to canonical models (#29001)

2024-02-16 08:16:58 +01:00

contributing.md

…

conversations.md

[docs] change temperature to a positive value (#32077)

2024-07-23 17:47:51 +01:00

create_a_model.md

Enable HF pretrained backbones (#31145)

2024-06-06 22:02:38 +01:00

custom_models.md

[Docs] Add language identifiers to fenced code blocks (#28955)

2024-02-12 10:48:31 -08:00

debugging.md

[Docs] Fix spelling and grammar mistakes (#28825)

2024-02-02 08:45:00 +01:00

deepspeed.md

Fix typos (#31819)

2024-07-08 11:52:47 +01:00

fast_tokenizers.md

…

fsdp.md

[docs] Trainer docs (#28145)

2023-12-20 10:37:23 -08:00

generation_strategies.md

Docs: alert for the possibility of manipulating logits (#32467)

2024-08-07 16:34:46 +01:00

gguf.md

Add Qwen2 GGUF loading support (#31175)

2024-06-03 14:55:10 +01:00

glossary.md

Fix typos (#31819)

2024-07-08 11:52:47 +01:00

hpo_train.md

Remove-auth-token (#27060)

2023-11-13 14:20:54 +01:00

index.md

Add new model (#32615)

2024-08-12 08:22:47 +02:00

installation.md

Use HF_HUB_OFFLINE + fix has_file in offline mode (#31016)

2024-05-29 11:55:43 +01:00

kv_cache.md

Cache: create docs (#32150)

2024-08-06 10:24:19 +05:00

llm_optims.md

Generate: end-to-end compilation (#30788)

2024-07-29 10:52:13 +01:00

llm_tutorial_optimization.md

Fix typos (#31819)

2024-07-08 11:52:47 +01:00

llm_tutorial.md

Generate: update links on LLM tutorial doc (#30550)

2024-04-30 18:14:12 +01:00

model_memory_anatomy.md

🚨🚨🚨Deprecate evaluation_strategy to eval_strategy🚨🚨🚨 (#30190)

2024-04-18 12:49:43 -04:00

model_sharing.md

Docs: formatting nits (#32247)

2024-07-30 15:49:14 +01:00

model_summary.md

model_summary.md - Restore link to Harvard's Annotated Transformer. (#29702)

2024-03-23 18:29:39 -07:00

multilingual.md

Update all references to canonical models (#29001)

2024-02-16 08:16:58 +01:00

notebooks.md

…

pad_truncation.md

[Doc] Spanish translation of pad_truncation.md (#27890)

2023-12-08 10:32:18 -08:00

peft.md

Docs / Quantization: Replace all occurences of load_in_8bit with bnb config (#31136)

2024-05-30 16:47:35 +02:00

perf_hardware.md

Fix typos (#31819)

2024-07-08 11:52:47 +01:00

perf_infer_cpu.md

[Docs] Fix spelling and grammar mistakes (#28825)

2024-02-02 08:45:00 +01:00

perf_infer_gpu_one.md

Add Qwen2-Audio (#32137)

2024-08-08 15:47:24 +02:00

perf_torch_compile.md

fix(docs): Fixed a link in docs (#32274)

2024-07-29 10:50:43 +01:00

perf_train_cpu_many.md

Update all references to canonical models (#29001)

2024-02-16 08:16:58 +01:00

perf_train_cpu.md

Update all references to canonical models (#29001)

2024-02-16 08:16:58 +01:00

perf_train_gpu_many.md

Update perf_train_gpu_many.md (#31451)

2024-06-18 11:00:26 -07:00

perf_train_gpu_one.md

Add torch_empty_cache_steps to TrainingArguments (#31546)

2024-07-04 13:20:49 -04:00

perf_train_special.md

Update all references to canonical models (#29001)

2024-02-16 08:16:58 +01:00

perf_train_tpu_tf.md

…

performance.md

[docs] Update CPU/GPU inference docs (#26881)

2023-10-31 09:44:51 -07:00

perplexity.md

Update all references to canonical models (#29001)

2024-02-16 08:16:58 +01:00

philosophy.md

[docs] fixed links with 404 (#27327)

2023-11-06 19:45:03 +00:00

pipeline_tutorial.md

Allow FP16 or other precision inference for Pipelines (#31342)

2024-07-05 17:21:50 +01:00

pipeline_webserver.md

Update all references to canonical models (#29001)

2024-02-16 08:16:58 +01:00

pr_checks.md

[Docs] Fix spelling and grammar mistakes (#28825)

2024-02-02 08:45:00 +01:00

preprocessing.md

chore: remove duplicate words (#31853)

2024-07-09 10:38:29 +01:00

quicktour.md

docs: fix broken link (#31370)

2024-06-12 11:33:00 +01:00

run_scripts.md

Fix broken link to Transformers notebooks (#30512)

2024-04-29 10:57:51 +01:00

sagemaker.md

[docs] fixed links with 404 (#27327)

2023-11-06 19:45:03 +00:00

serialization.md

Update all references to canonical models (#29001)

2024-02-16 08:16:58 +01:00

task_summary.md

More fixes for doctest (#30265)

2024-04-16 11:58:55 +02:00

tasks_explained.md

[docs] Spanish translation of tasks_explained.md (#29224)

2024-02-26 08:18:15 -08:00

testing.md

Docs: Fixed WhisperModel.forward’s docstring link (#32498)

2024-08-07 11:01:33 -07:00

tf_xla.md

fix(docs): Fixed a link in docs (#32274)

2024-07-29 10:50:43 +01:00

tflite.md

Update all references to canonical models (#29001)

2024-02-16 08:16:58 +01:00

tokenizer_summary.md

[docs] Spanish translation of tokenizer_summary.md (#31154)

2024-06-03 16:52:23 -07:00

torchscript.md

Update all references to canonical models (#29001)

2024-02-16 08:16:58 +01:00

trainer.md

Add support for GrokAdamW optimizer (#32521)

2024-08-13 13:20:28 +01:00

training.md

Added the necessay import of module (#30804)

2024-05-14 18:45:06 +01:00

troubleshooting.md

Update all references to canonical models (#29001)

2024-02-16 08:16:58 +01:00