[T5] Add training documentation (#3507)

* Add clear description of how to train T5
* correct docstring in T5
* correct typo
* correct docstring format
* update t5 model docs
* implement collins feedback
* fix typo and add more explanation for sentinel tokens
* delete unnecessary todos
2020-03-30 13:35:53 +02:00
parent 33ef7002e1
commit 5b44e0a31b
4 changed files with 48 additions and 32 deletions
--- a/src/transformers/modeling_bart.py
+++ b/src/transformers/modeling_bart.py
@@ -73,7 +73,7 @@ BART_INPUTS_DOCSTRING = r"""
            Mask to avoid performing attention on padding token indices in input_ids.
            Mask values selected in ``[0, 1]``:
            ``1`` for tokens that are NOT MASKED, ``0`` for MASKED tokens.
-        encoder_outputs (tuple(:obj:`tuple(torch.FloatTensor)`, `optional`, defaults to :obj:`None`):
+        encoder_outputs (:obj:`tuple(tuple(torch.FloatTensor)`, `optional`, defaults to :obj:`None`):
            Tuple consists of (`last_hidden_state`, `optional`: `hidden_states`, `optional`: `attentions`)
            `last_hidden_state` of shape :obj:`(batch_size, sequence_length, hidden_size)`, `optional`, defaults to :obj:`None`) is a sequence of hidden-states at the output of the last layer of the encoder.
            Used in the cross-attention of the decoder.