[docs] Increase visibility of torch_dtype="auto" (#35067)

* auto-dtype
* feedback
2024-12-04 09:18:44 -08:00
parent baa3b22137
commit 1ed1de2fec
11 changed files with 49 additions and 35 deletions
--- a/docs/source/en/training.md
+++ b/docs/source/en/training.md
@@ -81,12 +81,14 @@ just use the button at the top-right of that framework's block!

 🤗 Transformers provides a [`Trainer`] class optimized for training 🤗 Transformers models, making it easier to start training without manually writing your own training loop. The [`Trainer`] API supports a wide range of training options and features such as logging, gradient accumulation, and mixed precision.

-Start by loading your model and specify the number of expected labels. From the Yelp Review [dataset card](https://huggingface.co/datasets/yelp_review_full#data-fields), you know there are five labels:
+Start by loading your model and specify the number of expected labels. From the Yelp Review [dataset card](https://huggingface.co/datasets/yelp_review_full#data-fields), you know there are five labels.
+
+By default, the weights are loaded in full precision (`torch.float32`), regardless of the data type they are stored in, such as `torch.float16`. Set `torch_dtype="auto"` to load the weights in the data type defined in the model's `config.json` file, which is typically the most memory-efficient choice.

 ```py
 >>> from transformers import AutoModelForSequenceClassification

->>> model = AutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased", num_labels=5)
+>>> model = AutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased", num_labels=5, torch_dtype="auto")
 ```

 <Tip>