* Refactor checkpoint name in ALBERT and ALBERT_tf
* Refactor checkpoint name in BART and BART_tf
* Refactor checkpoint name in BERT generation
* Refactor checkpoint name in Blenderbot_tf
* Refactor checkpoint name in Blenderbot_small_tf
* Refactor checkpoint name in ConvBERT and ConvBERT_tf
* Refactor checkpoint name in CTRL and CTRL_tf
* Refactor checkpoint name in DistilBERT and DistilBERT_tf
* Refactor checkpoint name in DistilBERT redo
* Refactor checkpoint name in Electra and Electra_tf
* Refactor checkpoint name in FlauBERT and FlauBERT_tf
* Refactor checkpoint name in FSMT
* Refactor checkpoint name in GPT2 and GPT2_tf
* Refactor checkpoint name in IBERT
* Refactor checkpoint name in LED and LED_tf
* Refactor checkpoint name in Longformer and Longformer_tf
* Refactor checkpoint name in Lxmert and Lxmert_tf
* Refactor checkpoint name in Marian_tf
* Refactor checkpoint name in MBART and MBART_tf
* Refactor checkpoint name in MobileBERT and MobileBERT_tf
* Refactor checkpoint name in mpnet and mpnet_tf
* Refactor checkpoint name in openai and openai_tf
* Refactor checkpoint name in pegasus_tf
* Refactor checkpoint name in reformer
* Refactor checkpoint name in Roberta and Roberta_tf
* Refactor checkpoint name in SqueezeBert
* Refactor checkpoint name in Transformer_xl and Transformer_xl_tf
* Refactor checkpoint name in XLM and XLM_tf
* Refactor checkpoint name in XLNET and XLNET_tf
* Refactor checkpoint name in BERT_tf
* run make tests, style, quality, fixup
* Update past_key_values in gpt2 (#9391)
* Update generation_utils, and rename some items
* Update modeling_gpt2 to avoid an error in gradient_checkpointing
* Remove 'reorder_cache' from util and add variations to XLNet, TransfoXL, GPT-2
* Change the location of '_reorder_cache' in modeling files
* Add '_reorder_cache' in modeling_ctrl
* Fix a bug from my last commit in CTRL
* Add '_reorder_cache' to GPT2DoubleHeadsModel
* Manage 'use_cache' in config of test_modeling_gpt2
* Clean up the doc string
* Update src/transformers/models/gpt2/modeling_gpt2.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Fix the doc string (GPT-2, CTRL)
* improve gradient_checkpointing behavior
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Resize the biases at the same time as the embeddings
* Trigger CI
* Biases are not reset anymore
* Remove get_output_embeddings + better LM model detection in generation utils
* Apply style
* First test on BERT
* Update docstring + new name
* Apply the new resizing logic to all the models
* fix tests
* Apply style
* Update the template
* Fix naming
* Fix naming
* Apply style
* Apply style
* Remove unused import
* Revert get_output_embeddings
* Trigger CI
* Update num parameters
* Restore get_output_embeddings in TFPretrainedModel and add comments
* Style
* Add decoder resizing
* Style
* Fix tests
* Separate bias and decoder resize
* Fix tests
* Fix tests
* Apply style
* Add bias resizing in MPNet
* Trigger CI
* Apply style
* fix mems in xlnet
* fix use_mems
* fix use_mem_len
* fix use mems
* clean docs
* fix tf typo
* make xlnet tf for generation work
* fix tf test
* refactor use cache
* add use cache for missing models
* correct use_cache in generate
* correct use cache in tf generate
* fix tf
* correct getattr typo
* make sylvain happy
* change in docs as well
* do not apply to cookie cutter statements
* fix tf test
* make pytorch model fully backward compatible
* bart output hidden states upstream
* same w/ decoder
* add tests
* fix prophetnet
* fix gpt2 and ctrl
* fix fsmt and skip test for reformer and longformer
* fix all models
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Put models in subfolders
* Styling
* Fix imports in tests
* More fixes in test imports
* Sneaky hidden imports
* Fix imports in doc files
* More sneaky imports
* Finish fixing tests
* Fix examples
* Fix path for copies
* More fixes for examples
* Fix dummy files
* More fixes for example
* More model import fixes
* Is this why you're unhappy GitHub?
* Fix imports in convert command