From 12bb7fe77068a2a18b9c48320006dc91db4d4db0 Mon Sep 17 00:00:00 2001
From: Lorenzo Ampil
Date: Tue, 28 Apr 2020 00:27:15 +0800
Subject: [PATCH] Fix t5 doc typos (#3978)

* Fix tpo in into and add line under

* Add missing blank line under

* Correct types under
---
 docs/source/model_doc/t5.rst | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/docs/source/model_doc/t5.rst b/docs/source/model_doc/t5.rst
index 6ddfdbcc29..38069801fd 100644
--- a/docs/source/model_doc/t5.rst
+++ b/docs/source/model_doc/t5.rst
@@ -20,13 +20,14 @@ Training
 ~~~~~~~~~~~~~~~~~~~~
 
 T5 is an encoder-decoder model and converts all NLP problems into a text-to-text format. It is trained using teacher forcing. This means that for training we always need an input sequence and a target sequence.
-The input sequence is fed to the model using ``input_ids``. The target sequence is shifted to the right, *i.e.* perprended by a start-sequence token and fed to the decoder using the `decoder_input_ids`. In teacher-forcing style, the target sequence is then appended by the EOS token and corresponds to the ``lm_labels``. The PAD token is hereby used as the start-sequence token.
+The input sequence is fed to the model using ``input_ids``. The target sequence is shifted to the right, *i.e.* prepended by a start-sequence token and fed to the decoder using the `decoder_input_ids`. In teacher-forcing style, the target sequence is then appended by the EOS token and corresponds to the ``lm_labels``. The PAD token is hereby used as the start-sequence token.
 
 T5 can be trained / fine-tuned both in a supervised and unsupervised fashion.
 
 - Unsupervised denoising training
+
   In this setup spans of the input sequence are masked by so-called sentinel tokens (*a.k.a* unique mask tokens) and the output sequence is formed as a concatenation of the same sentinel tokens and the *real* masked tokens.
-  Each sentinel tokens represents a unique mask token for this sentence and should start with ````, ````, ... up to ````. As a default 100 sentinel tokens are available in ``T5Tokenizer``.
+  Each sentinel token represents a unique mask token for this sentence and should start with ````, ````, ... up to ````. As a default 100 sentinel tokens are available in ``T5Tokenizer``.
   *E.g.* the sentence "The cute dog walks in the park" with the masks put on "cute dog" and "the" should be processed as follows:
 
 ::
@@ -37,6 +38,7 @@ T5 can be trained / fine-tuned both in a supervised and unsupervised fashion.
     model(input_ids=input_ids, lm_labels=lm_labels)
 
 - Supervised training
+
   In this setup the input sequence and output sequence are standard sequence to sequence input output mapping.
   In translation, *e.g.* the input sequence "The house is wonderful." and output sequence "Das Haus ist wunderbar." should be processed as follows: