Doc styling (#8067)

* Important files
* Styling them all
* Revert "Styling them all"

  This reverts commit 7d029395fdae8513b8281cbc2a6c239f8093503e.
* Syling them for realsies
* Fix syntax error
* Fix benchmark_utils
* More fixes
* Fix modeling auto and script
* Remove new line
* Fixes
* More fixes
* Fix more files
* Style
* Add FSMT
* More fixes
* More fixes
* More fixes
* More fixes
* Fixes
* More fixes
* More fixes
* Last fixes
* Make sphinx happy
2020-10-26 18:26:02 -04:00
parent 04a17f8550
commit 08f534d2da
271 changed files with 9726 additions and 8991 deletions
--- a/docs/source/model_doc/gpt.rst
+++ b/docs/source/model_doc/gpt.rst
@@ -6,44 +6,39 @@ Overview

 OpenAI GPT model was proposed in `Improving Language Understanding by Generative Pre-Training
 <https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf>`__
-by Alec Radford, Karthik Narasimhan, Tim Salimans and Ilya Sutskever. It's a causal (unidirectional)
-transformer pre-trained using language modeling on a large corpus with long range dependencies, the Toronto Book
-Corpus.
+by Alec Radford, Karthik Narasimhan, Tim Salimans and Ilya Sutskever. It's a causal (unidirectional) transformer
+pre-trained using language modeling on a large corpus with long range dependencies, the Toronto Book Corpus.

 The abstract from the paper is the following:

-*Natural language understanding comprises a wide range of diverse tasks such
-as textual entailment, question answering, semantic similarity assessment, and
-document classification. Although large unlabeled text corpora are abundant,
-labeled data for learning these specific tasks is scarce, making it challenging for
-discriminatively trained models to perform adequately. We demonstrate that large
-gains on these tasks can be realized by generative pre-training of a language model
-on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each
-specific task. In contrast to previous approaches, we make use of task-aware input
-transformations during fine-tuning to achieve effective transfer while requiring
-minimal changes to the model architecture. We demonstrate the effectiveness of
-our approach on a wide range of benchmarks for natural language understanding.
-Our general task-agnostic model outperforms discriminatively trained models that
-use architectures specifically crafted for each task, significantly improving upon the
-state of the art in 9 out of the 12 tasks studied.*
+*Natural language understanding comprises a wide range of diverse tasks such as textual entailment, question answering,
+semantic similarity assessment, and document classification. Although large unlabeled text corpora are abundant,
+labeled data for learning these specific tasks is scarce, making it challenging for discriminatively trained models to
+perform adequately. We demonstrate that large gains on these tasks can be realized by generative pre-training of a
+language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task. In
+contrast to previous approaches, we make use of task-aware input transformations during fine-tuning to achieve
+effective transfer while requiring minimal changes to the model architecture. We demonstrate the effectiveness of our
+approach on a wide range of benchmarks for natural language understanding. Our general task-agnostic model outperforms
+discriminatively trained models that use architectures specifically crafted for each task, significantly improving upon
+the state of the art in 9 out of the 12 tasks studied.*

 Tips:

- GPT is a model with absolute position embeddings so it's usually advised to pad the inputs on
-  the right rather than the left.
+- GPT is a model with absolute position embeddings so it's usually advised to pad the inputs on the right rather than
+  the left.
 - GPT was trained with a causal language modeling (CLM) objective and is therefore powerful at predicting the next
-  token in a sequence. Leveraging this feature allows GPT-2 to generate syntactically coherent text as
-  it can be observed in the `run_generation.py` example script.
+  token in a sequence. Leveraging this feature allows GPT-2 to generate syntactically coherent text as it can be
+  observed in the `run_generation.py` example script.
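
The right-padding tip above can be illustrated with a minimal, library-free sketch. The helper name ``pad_right`` and the pad id ``0`` are assumptions made for this illustration and are not part of the ``transformers`` API. With padding on the right, every real token keeps the position index it would have in an unpadded sequence, which is what a model with absolute position embeddings expects.

```python
from typing import List, Tuple


def pad_right(batch: List[List[int]], pad_id: int = 0) -> Tuple[List[List[int]], List[List[int]]]:
    """Pad each sequence on the right to the length of the longest one.

    Returns the padded token ids and a matching attention mask
    (1 for real tokens, 0 for padding). Hypothetical helper, for illustration only.
    """
    if not batch:
        return [], []
    max_len = max(len(seq) for seq in batch)
    # Real tokens stay at positions 0..len(seq)-1; padding fills the tail.
    input_ids = [seq + [pad_id] * (max_len - len(seq)) for seq in batch]
    attention_mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in batch]
    return input_ids, attention_mask


input_ids, attention_mask = pad_right([[5, 6, 7], [8]])
print(input_ids)
print(attention_mask)
```

Padding on the left instead would shift the real tokens of the shorter sequence to later positions, so they would be paired with position embeddings they were not trained with at the start of a sequence.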

-`Write With Transformer <https://transformer.huggingface.co/doc/gpt>`__ is a webapp created and hosted by
-Hugging Face showcasing the generative capabilities of several models. GPT is one of them.
+`Write With Transformer <https://transformer.huggingface.co/doc/gpt>`__ is a webapp created and hosted by Hugging Face
+showcasing the generative capabilities of several models. GPT is one of them.

 The original code can be found `here <https://github.com/openai/finetune-transformer-lm>`__.

 Note:

-If you want to reproduce the original tokenization process of the `OpenAI GPT` paper, you will need to install 
-``ftfy`` and ``SpaCy``::
+If you want to reproduce the original tokenization process of the `OpenAI GPT` paper, you will need to install ``ftfy``
+and ``SpaCy``::

 .. code-block:: bash

@@ -51,8 +46,7 @@ If you want to reproduce the original tokenization process of the `OpenAI GPT` p
    python -m spacy download en

 If you don't install ``ftfy`` and ``SpaCy``, the :class:`~transformers.OpenAIGPTTokenizer` will default to tokenize
-using BERT's :obj:`BasicTokenizer` followed by Byte-Pair Encoding (which should be fine for most usage, don't 
-worry).
+using BERT's :obj:`BasicTokenizer` followed by Byte-Pair Encoding (which should be fine for most usage, don't worry).
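
To see which of the two tokenization paths described above an environment is likely to take, one can check whether both optional packages are importable. This is only a sketch using the standard library; the function name ``original_tokenization_available`` is hypothetical and is not part of ``transformers``, and the check says nothing about whether the SpaCy English model has been downloaded.

```python
import importlib.util


def original_tokenization_available() -> bool:
    """Return True if both optional packages, ftfy and spacy, can be imported.

    Hypothetical helper: it only probes for the packages and does not
    load the tokenizer or any SpaCy language model.
    """
    return all(
        importlib.util.find_spec(name) is not None
        for name in ("ftfy", "spacy")
    )


if original_tokenization_available():
    print("ftfy and SpaCy found: the original tokenization can be reproduced.")
else:
    print("ftfy and/or SpaCy missing: expect the BasicTokenizer fallback.")
```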

 OpenAIGPTConfig
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~