[XLNet] Fix mems behavior (#8567)
* fix mems in xlnet * fix use_mems * fix use_mem_len * fix use mems * clean docs * fix tf typo * make xlnet tf for generation work * fix tf test * refactor use cache * add use cache for missing models * correct use_cache in generate * correct use cache in tf generate * fix tf * correct getattr typo * make sylvain happy * change in docs as well * do not apply to cookie cutter statements * fix tf test * make pytorch model fully backward compatible
This commit is contained in:
committed by
GitHub
parent
369f1d77b4
commit
2a6fbe6a40
@@ -10,7 +10,7 @@ Tasks <https://arxiv.org/abs/1907.12461>`__ by Sascha Rothe, Shashi Narayan, Ali
|
||||
|
||||
The abstract from the paper is the following:
|
||||
|
||||
*Unsupervised pre-training of large neural models has recently revolutionized Natural Language Processing. By
|
||||
*Unsupervised pretraining of large neural models has recently revolutionized Natural Language Processing. By
|
||||
warm-starting from the publicly released checkpoints, NLP practitioners have pushed the state-of-the-art on multiple
|
||||
benchmarks while saving significant amounts of compute time. So far the focus has been mainly on the Natural Language
|
||||
Understanding tasks. In this paper, we demonstrate the efficacy of pre-trained checkpoints for Sequence Generation. We
|
||||
|
||||
Reference in New Issue
Block a user