Cleanup documentation for BART, Marian, MBART and Pegasus (#7523)

* Cleanup documentation for BART, Marian, MBART and Pegasus

* Cleanup documentation for BART, Marian, MBART and Pegasus
This commit is contained in:
Sylvain Gugger
2020-10-05 04:22:12 -04:00
committed by GitHub
parent 5e941bece2
commit e2c935f561
16 changed files with 594 additions and 284 deletions

View File

@@ -1,15 +1,20 @@
MBart
-----------------------------------------------------------------------------------------------------------------------
**DISCLAIMER:** If you see something strange,
file a `Github Issue <https://github.com/huggingface/transformers/issues/new?assignees=&labels=&template=bug-report.md&title>`__ and assign
**DISCLAIMER:** If you see something strange, file a `Github Issue
<https://github.com/huggingface/transformers/issues/new?assignees=&labels=&template=bug-report.md&title>`__ and assign
@sshleifer
Overview
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The MBart model was presented in `Multilingual Denoising Pre-training for Neural Machine Translation <https://arxiv.org/abs/2001.08210>`_ by Yinhan Liu, Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov
Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer. According to the abstract,
The MBart model was presented in `Multilingual Denoising Pre-training for Neural Machine Translation
<https://arxiv.org/abs/2001.08210>`_ by Yinhan Liu, Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov
Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer.
MBART is a sequence-to-sequence denoising auto-encoder pre-trained on large-scale monolingual corpora in many languages using the BART objective. mBART is one of the first methods for pre-training a complete sequence-to-sequence model by denoising full texts in multiple languages, while previous approaches have focused only on the encoder, decoder, or reconstructing parts of the text.
According to the abstract, MBART is a sequence-to-sequence denoising auto-encoder pretrained on large-scale monolingual
corpora in many languages using the BART objective. mBART is one of the first methods for pre-training a complete
sequence-to-sequence model by denoising full texts in multiple languages, while previous approaches have focused only
on the encoder, decoder, or reconstructing parts of the text.
The Authors' code can be found `here <https://github.com/pytorch/fairseq/tree/master/examples/mbart>`__
@@ -18,10 +23,11 @@ Training
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
MBart is a multilingual encoder-decoder (seq-to-seq) model primarily intended for translation task.
As the model is multilingual it expects the sequences in a different format. A special language id token
is added in both the source and target text. The source text format is ``X [eos, src_lang_code]``
where ``X`` is the source text. The target text format is ```[tgt_lang_code] X [eos]```. ```bos``` is never used.
The ```MBartTokenizer.prepare_seq2seq_batch``` handles this automatically and should be used to encode
the sequences for seq-2-seq fine-tuning.
is added in both the source and target text. The source text format is :obj:`X [eos, src_lang_code]`
where :obj:`X` is the source text. The target text format is :obj:`[tgt_lang_code] X [eos]`. :obj:`bos` is never used.
The :meth:`~transformers.MBartTokenizer.prepare_seq2seq_batch` handles this automatically and should be used to encode
the sequences for sequence-to-sequence fine-tuning.
- Supervised training
@@ -38,8 +44,8 @@ the sequences for seq-2-seq fine-tuning.
- Generation
While generating the target text set the `decoder_start_token_id` to the target language id.
The following example shows how to translate English to Romanian using the ```facebook/mbart-large-en-ro``` model.
While generating the target text set the :obj:`decoder_start_token_id` to the target language id.
The following example shows how to translate English to Romanian using the `facebook/mbart-large-en-ro` model.
.. code-block::
@@ -71,6 +77,4 @@ MBartForConditionalGeneration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.MBartForConditionalGeneration
:members: generate, forward
:members: forward