GPT Neo few fixes (#10968)
* fix checkpoint names * auto model * fix doc
This commit is contained in:
@@ -31,8 +31,8 @@ The :obj:`generate()` method can be used to generate text using GPT Neo model.
|
||||
.. code-block::
|
||||
|
||||
>>> from transformers import GPTNeoForCausalLM, GPT2Tokenizer
|
||||
>>> model = GPTNeoForCausalLM.from_pretrained("EleutherAI/gpt_neo_xl")
|
||||
>>> tokenizer = GPT2Tokenizer.from_pretrained("EleutherAI/gpt_neo_xl")
|
||||
>>> model = GPTNeoForCausalLM.from_pretrained("EleutherAI/gpt-neo-1.3B")
|
||||
>>> tokenizer = GPT2Tokenizer.from_pretrained("EleutherAI/gpt-neo-1.3B")
|
||||
|
||||
>>> prompt = "In a shocking finding, scientists discovered a herd of unicorns living in a remote, " \
|
||||
... "previously unexplored valley, in the Andes Mountains. Even more surprising to the " \
|
||||
|
||||
@@ -139,10 +139,10 @@ For the full list, refer to `https://huggingface.co/models <https://huggingface.
|
||||
| | ``gpt2-xl`` | | 48-layer, 1600-hidden, 25-heads, 1558M parameters. |
|
||||
| | | | OpenAI's XL-sized GPT-2 English model |
|
||||
+--------------------+------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
|
||||
| GPTNeo | ``EleutherAI/gpt_neo_xl`` | | 24-layer, 2048-hidden, 16-heads, 1.3B parameters. |
|
||||
| GPTNeo | ``EleutherAI/gpt-neo-1.3B`` | | 24-layer, 2048-hidden, 16-heads, 1.3B parameters. |
|
||||
| | | | EleutherAI's GPT-3 like language model. |
|
||||
| +------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
|
||||
| | ``EleutherAI/gpt_neo_2.7B`` | | 32-layer, 2560-hidden, 20-heads, 2.7B parameters. |
|
||||
| | ``EleutherAI/gpt-neo-2.7B`` | | 32-layer, 2560-hidden, 20-heads, 2.7B parameters. |
|
||||
| | | | EleutherAI's GPT-3 like language model. |
|
||||
+--------------------+------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
|
||||
| Transformer-XL | ``transfo-xl-wt103`` | | 18-layer, 1024-hidden, 16-heads, 257M parameters. |
|
||||
|
||||
Reference in New Issue
Block a user