Rename BartForMaskedLM -> BartForConditionalGeneration (#3114)

* improved documentation
2020-03-05 17:41:18 -05:00
parent fa2aa699da
commit 857e0a0d3b
7 changed files with 75 additions and 71 deletions
--- a/docs/source/model_doc/bart.rst
+++ b/docs/source/model_doc/bart.rst
@@ -7,7 +7,7 @@ file a `Github Issue <https://github.com/huggingface/transformers/issues/new?ass
 Paper
 ~~~~~
 The Bart model was `proposed <https://arxiv.org/abs/1910.13461>`_ by Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov and Luke Zettlemoyer on 29 Oct, 2019.
-According to the abstract:
+According to the abstract,

 - Bart uses a standard seq2seq/machine translation architecture with a bidirectional encoder (like BERT) and a left-to-right decoder (like GPT).
 - The pretraining task involves randomly shuffling the order of the original sentences and a novel in-filling scheme, where spans of text are replaced with a single mask token.
@@ -18,26 +18,28 @@ The Authors' code can be found `here <https://github.com/pytorch/fairseq/tree/ma

 Implementation Notes
 ~~~~~~~~~~~~~~~~~~~~
- Bart doesn't use :obj:`token_type_ids`, for sequence classification just use BartTokenizer.encode to get the proper splitting.
- Inputs to the decoder are created by BartModel.forward if they are not passed. This is different than some other model APIs.
- Model predictions are intended to be identical to the original implementation. This only works, however, if the string you pass to fairseq.encode starts with a space.
- Decoder inputs are created automatically by the helper function ``transformers.modeling_bart._prepare_bart_decoder_inputs``
-BartModel
- ``MaskedLM.generate`` should be used for summarization, see the example in that docstrings
+- Bart doesn't use :obj:`token_type_ids` for sequence classification. Use ``BartTokenizer.encode`` to get the proper splitting.
+- The forward pass of ``BartModel`` will create decoder inputs (using the helper function ``transformers.modeling_bart._prepare_bart_decoder_inputs``) if they are not passed. This is different from some other modeling APIs.
+- Model predictions are intended to be identical to the original implementation. This only works, however, if the string you pass to ``fairseq.encode`` starts with a space.
+- ``BartForConditionalGeneration.generate`` should be used for conditional generation tasks like summarization; see the example in its docstring.
+- Models that load the ``"bart-large-cnn"`` weights do not have a ``mask_token_id`` and cannot perform mask-filling tasks.
+


 BartModel
-~~~~~~~~~~~~~~~~~~~~
+~~~~~~~~~~~~~

 .. autoclass:: transformers.BartModel
    :members: forward

+.. autofunction:: transformers.modeling_bart._prepare_bart_decoder_inputs

-BartForMaskedLM
-~~~~~~~~~~~~~~~~~~~~~~~~~~

-.. autoclass:: transformers.BartForMaskedLM
-    :members: forward, generate
+BartForConditionalGeneration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: transformers.BartForConditionalGeneration
+    :members: generate, forward


 BartForSequenceClassification
@@ -52,8 +54,3 @@ BartConfig
 .. autoclass:: transformers.BartConfig
    :members:

-Automatic Creation of Decoder Inputs
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-This is enabled by default
-
-.. autofunction:: transformers.modeling_bart._prepare_bart_decoder_inputs