@@ -23,7 +23,7 @@ The abstract from the paper is the following:
 
 Tips:
 - OPT has the same architecture as [`BartDecoder`].
-- Contrary to GPT2, OPT adds the EOS token `</s>` to the beginning of every prompt. **Note**: Make sure to pass `use_fast=False` when loading OPT's tokenizer with [`AutoTokenizer`] to get the correct tokenizer.
+- Contrary to GPT2, OPT adds the EOS token `</s>` to the beginning of every prompt.
 
 This model was contributed by [Arthur Zucker](https://huggingface.co/ArthurZ), [Younes Belkada](https://huggingface.co/ybelkada), and [Patrick Von Platen](https://huggingface.co/patrickvonplaten).
 The original code can be found [here](https://github.com/facebookresearch/metaseq).