[XLNet] Fix mems behavior (#8567)

* fix mems in xlnet

* fix use_mems

* fix use_mem_len

* fix use mems

* clean docs

* fix tf typo

* make xlnet tf for generation work

* fix tf test

* refactor use cache

* add use cache for missing models

* correct use_cache in generate

* correct use cache in tf generate

* fix tf

* correct getattr typo

* make sylvain happy

* change in docs as well

* do not apply to cookie cutter statements

* fix tf test

* make pytorch model fully backward compatible
This commit is contained in:
Patrick von Platen
2020-11-25 22:54:59 +01:00
committed by GitHub
parent 369f1d77b4
commit 2a6fbe6a40
47 changed files with 259 additions and 134 deletions

View File

@@ -305,7 +305,7 @@ Language modeling is the task of fitting a model to a corpus, which can be domai
transformer-based models are trained using a variant of language modeling, e.g. BERT with masked language modeling,
GPT-2 with causal language modeling.
Language modeling can be useful outside of pre-training as well, for example to shift the model distribution to be
Language modeling can be useful outside of pretraining as well, for example to shift the model distribution to be
domain-specific: using a language model trained over a very large corpus, and then fine-tuning it to a news dataset or
on scientific papers e.g. `LysandreJik/arxiv-nlp <https://huggingface.co/lysandre/arxiv-nlp>`__.