Enable doc in Spanish (#16518)
* Reorganize doc for multilingual support
* Fix style
* Style
* Toc trees
* Adapt templates
98
docs/source/en/model_doc/mt5.mdx
Normal file
@@ -0,0 +1,98 @@
<!--Copyright 2020 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# mT5

## Overview

The mT5 model was presented in [mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/abs/2010.11934) by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya
Siddhant, Aditya Barua, and Colin Raffel.

The abstract from the paper is the following:

*The recent "Text-to-Text Transfer Transformer" (T5) leveraged a unified text-to-text format and scale to attain
state-of-the-art results on a wide variety of English-language NLP tasks. In this paper, we introduce mT5, a
multilingual variant of T5 that was pre-trained on a new Common Crawl-based dataset covering 101 languages. We detail
the design and modified training of mT5 and demonstrate its state-of-the-art performance on many multilingual
benchmarks. We also describe a simple technique to prevent "accidental translation" in the zero-shot setting, where a
generative model chooses to (partially) translate its prediction into the wrong language. All of the code and model
checkpoints used in this work are publicly available.*

Note: mT5 was pre-trained only on [mC4](https://huggingface.co/datasets/mc4), without any supervised training.
Therefore, this model has to be fine-tuned before it is usable on a downstream task, unlike the original T5 model.
Since mT5 was pre-trained without supervision, there is no real advantage to using a task prefix during single-task
fine-tuning. If you are doing multi-task fine-tuning, you should use a prefix.
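
The prefix guidance above can be sketched in plain Python. This is only an illustration of the idea, not code from the library: the task names, the prefix strings, and the `build_inputs` helper are all hypothetical and would be replaced by whatever your fine-tuning setup uses.

```python
from typing import Dict, List

# Hypothetical task names and prefix strings, chosen only for illustration.
TASK_PREFIXES: Dict[str, str] = {
    "summarization": "summarize: ",
    "translation_en_de": "translate English to German: ",
}


def build_inputs(texts: List[str], task: str, multi_task: bool) -> List[str]:
    """Return source strings, prefixed only when several tasks share one model."""
    if not multi_task:
        # Single-task fine-tuning: the prefix carries no extra information.
        return list(texts)
    prefix = TASK_PREFIXES[task]
    return [prefix + text for text in texts]


if __name__ == "__main__":
    print(build_inputs(["The house is small."], "translation_en_de", multi_task=True))
    print(build_inputs(["The house is small."], "translation_en_de", multi_task=False))
```

The resulting strings would then be passed to the tokenizer as usual; in the multi-task case the prefix is what lets one model tell the tasks apart.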

Google has released the following variants:

- [google/mt5-small](https://huggingface.co/google/mt5-small)

- [google/mt5-base](https://huggingface.co/google/mt5-base)

- [google/mt5-large](https://huggingface.co/google/mt5-large)

- [google/mt5-xl](https://huggingface.co/google/mt5-xl)

- [google/mt5-xxl](https://huggingface.co/google/mt5-xxl)
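
As a minimal sketch, one of the checkpoints listed above could be loaded as follows. The `MT5_CHECKPOINTS` mapping and the `checkpoint_for` and `load_mt5` helpers are hypothetical names introduced for this example; `AutoTokenizer` and `MT5ForConditionalGeneration` are the library classes, and the sketch assumes `transformers`, `torch` and `sentencepiece` are installed and the Hub is reachable.

```python
# Hypothetical size-to-checkpoint mapping built from the list above.
MT5_CHECKPOINTS = {
    "small": "google/mt5-small",
    "base": "google/mt5-base",
    "large": "google/mt5-large",
    "xl": "google/mt5-xl",
    "xxl": "google/mt5-xxl",
}


def checkpoint_for(size: str) -> str:
    """Map a short size name to its Hub checkpoint identifier."""
    try:
        return MT5_CHECKPOINTS[size]
    except KeyError:
        raise ValueError(f"Unknown mT5 size: {size!r}") from None


def load_mt5(size: str = "small"):
    """Download the tokenizer and model for the given size (needs network access)."""
    # Imported lazily so the lookup helper works without transformers installed.
    from transformers import AutoTokenizer, MT5ForConditionalGeneration

    name = checkpoint_for(size)
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = MT5ForConditionalGeneration.from_pretrained(name)
    return tokenizer, model
```

Calling `load_mt5("small")` returns a pre-trained but not fine-tuned model, so its generations are not meaningful until it has been fine-tuned on a downstream task.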

This model was contributed by [patrickvonplaten](https://huggingface.co/patrickvonplaten). The original code can be
found [here](https://github.com/google-research/multilingual-t5).

## MT5Config

[[autodoc]] MT5Config

## MT5Tokenizer

[[autodoc]] MT5Tokenizer

See [`T5Tokenizer`] for all details.

## MT5TokenizerFast

[[autodoc]] MT5TokenizerFast

See [`T5TokenizerFast`] for all details.

## MT5Model

[[autodoc]] MT5Model

## MT5ForConditionalGeneration

[[autodoc]] MT5ForConditionalGeneration

## MT5EncoderModel

[[autodoc]] MT5EncoderModel

## TFMT5Model

[[autodoc]] TFMT5Model

## TFMT5ForConditionalGeneration

[[autodoc]] TFMT5ForConditionalGeneration

## TFMT5EncoderModel

[[autodoc]] TFMT5EncoderModel

## FlaxMT5Model

[[autodoc]] FlaxMT5Model

## FlaxMT5ForConditionalGeneration

[[autodoc]] FlaxMT5ForConditionalGeneration