Documentation (#2989)
* All Tokenizers BertTokenizer + few fixes RobertaTokenizer OpenAIGPTTokenizer + Fixes GPT2Tokenizer + fixes TransfoXLTokenizer Correct rst for TransformerXL XLMTokenizer + fixes XLNet Tokenizer + Style DistilBERT + Fix XLNet RST CTRLTokenizer CamemBERT Tokenizer FlaubertTokenizer XLMRobertaTokenizer cleanup * cleanup
This commit is contained in:
@@ -5,7 +5,7 @@ Overview
|
||||
~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
OpenAI GPT-2 model was proposed in
|
||||
`Language Models are Unsupervised Multitask Learners`_
|
||||
`Language Models are Unsupervised Multitask Learners <https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf>`_
|
||||
by Alec Radford*, Jeffrey Wu*, Rewon Child, David Luan, Dario Amodei** and Ilya Sutskever**.
|
||||
It's a causal (unidirectional) transformer pre-trained using language modeling on a very large
|
||||
corpus of ~40 GB of text data.
|
||||
@@ -46,7 +46,7 @@ GPT2Tokenizer
|
||||
~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.GPT2Tokenizer
|
||||
:members:
|
||||
:members: save_vocabulary
|
||||
|
||||
|
||||
GPT2Model
|
||||
|
||||
Reference in New Issue
Block a user