Use HF papers (#38184)
* Use hf papers * Hugging Face papers * doi to hf papers * style
This commit is contained in:
committed by
GitHub
parent
1031ed5166
commit
de24fb63ed
@@ -75,7 +75,7 @@ python run_clm_no_trainer.py \
|
||||
|
||||
### GPT-2/GPT and causal language modeling with fill-in-the middle objective
|
||||
|
||||
The following example fine-tunes GPT-2 on WikiText-2 but using the Fill-in-middle training objective. FIM objective was proposed in [Efficient Training of Language Models to Fill in the Middle](https://arxiv.org/abs/2207.14255). They showed that autoregressive language models can learn to infill text after applying a straightforward transformation to the dataset, which simply moves a span of text from the middle of a document to its end.
|
||||
The following example fine-tunes GPT-2 on WikiText-2 but using the Fill-in-middle training objective. FIM objective was proposed in [Efficient Training of Language Models to Fill in the Middle](https://huggingface.co/papers/2207.14255). They showed that autoregressive language models can learn to infill text after applying a straightforward transformation to the dataset, which simply moves a span of text from the middle of a document to its end.
|
||||
|
||||
We're using the raw WikiText-2 (no tokens were replaced before the tokenization). The loss here is that of causal language modeling.
|
||||
|
||||
|
||||
Reference in New Issue
Block a user