[Docs] Model_doc structure/clarity improvements (#26876)
* first batch of structure improvements for model_docs
* second batch of structure improvements for model_docs
* more structure improvements for model_docs
* more structure improvements for model_docs
* structure improvements for cv model_docs
* more structural refactoring
* addressed feedback about image processors
@@ -45,7 +45,11 @@ with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-
 summarization, question answering, text classification, and more. To facilitate future work on transfer learning for
 NLP, we release our dataset, pre-trained models, and code.*

-Tips:
+All checkpoints can be found on the [hub](https://huggingface.co/models?search=t5).
+
+This model was contributed by [thomwolf](https://huggingface.co/thomwolf). The original code can be found [here](https://github.com/google-research/text-to-text-transfer-transformer).
+
+## Usage tips

 - T5 is an encoder-decoder model pre-trained on a multi-task mixture of unsupervised and supervised tasks and for which
 each task is converted into a text-to-text format. T5 works well on a variety of tasks out-of-the-box by prepending a
@@ -91,12 +95,6 @@ Based on the original T5 model, Google has released some follow-up works:
 - **UMT5**: UmT5 is a multilingual T5 model trained on an improved and refreshed mC4 multilingual corpus, 29 trillion characters across 107 language, using a new sampling method, UniMax. Refer to
 the documentation of mT5 which can be found [here](umt5).

-All checkpoints can be found on the [hub](https://huggingface.co/models?search=t5).
-
-This model was contributed by [thomwolf](https://huggingface.co/thomwolf). The original code can be found [here](https://github.com/google-research/text-to-text-transfer-transformer).
-
-<a id='training'></a>
-
 ## Training

 T5 is an encoder-decoder model and converts all NLP problems into a text-to-text format. It is trained using teacher
@@ -249,8 +247,6 @@ batches to the longest example is not recommended on TPU as it triggers a recomp
 encountered during training thus significantly slowing down the training. only padding up to the longest example in a
 batch) leads to very slow training on TPU.

-<a id='inference'></a>
-
 ## Inference

 At inference time, it is recommended to use [`~generation.GenerationMixin.generate`]. This
@@ -316,9 +312,6 @@ The predicted tokens will then be placed between the sentinel tokens.
 ['<pad><extra_id_0> park offers<extra_id_1> the<extra_id_2> park.</s>']
 ```

-
-<a id='scripts'></a>
-
 ## Performance

 If you'd like a faster training and inference performance, install [apex](https://github.com/NVIDIA/apex#quick-start) and then the model will automatically use `apex.normalization.FusedRMSNorm` instead of `T5LayerNorm`. The former uses an optimized fused kernel which is several times faster than the latter.
@@ -386,6 +379,9 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h

 [[autodoc]] T5TokenizerFast

+<frameworkcontent>
+<pt>
+
 ## T5Model

 [[autodoc]] T5Model
@@ -411,6 +407,9 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h
 [[autodoc]] T5ForQuestionAnswering
 - forward

+</pt>
+<tf>
+
 ## TFT5Model

 [[autodoc]] TFT5Model
@@ -426,6 +425,9 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h
 [[autodoc]] TFT5EncoderModel
 - call

+</tf>
+<jax>
+
 ## FlaxT5Model

 [[autodoc]] FlaxT5Model
@@ -444,3 +446,6 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h

 [[autodoc]] FlaxT5EncoderModel
 - __call__
+
+</jax>
+</frameworkcontent>
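
Note on the sentinel-token output shown as context in the `-316,9 +312,6` hunk: the docs say the predicted tokens are placed between the sentinel tokens. As a rough illustration only, the sketch below splits such a decoded string into one span per sentinel using plain Python. The helper name `split_sentinel_spans` is hypothetical and is not part of `transformers`; it assumes the decoded string still contains the `<pad>`, `</s>`, and `<extra_id_N>` markers exactly as in the example output.

```python
import re


def split_sentinel_spans(decoded):
    """Map each <extra_id_N> sentinel to the text predicted after it.

    Hypothetical helper for illustration; not a transformers API.
    """
    # Drop the special tokens that wrap the generated sequence.
    body = decoded.replace("<pad>", "").replace("</s>", "")
    # A capturing group makes re.split keep the sentinels in the result,
    # so the list alternates: prefix, sentinel, text, sentinel, text, ...
    parts = re.split(r"(<extra_id_\d+>)", body)
    spans = {}
    for sentinel, text in zip(parts[1::2], parts[2::2]):
        spans[sentinel] = text.strip()
    return spans


# The decoded output quoted in the diff context above.
output = "<pad><extra_id_0> park offers<extra_id_1> the<extra_id_2> park.</s>"
spans = split_sentinel_spans(output)
print(spans)
```

Each dictionary value is the text the model predicted for the masked span that the corresponding sentinel replaced in the input.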