Julien Plu
bf7f79cd57
Optional layers (#8961)
...
* Apply on BERT and ALBERT
* Update TF Bart
* Add input processing to TF BART
* Add input processing for TF CTRL
* Add input processing to TF Distilbert
* Add input processing to TF DPR
* Add input processing to TF Electra
* Add deprecated arguments
* Add input processing to TF XLM
* remove unused imports
* Add input processing to TF Funnel
* Add input processing to TF GPT2
* Add input processing to TF Longformer
* Add input processing to TF Lxmert
* Apply style
* Add input processing to TF Mobilebert
* Add input processing to TF GPT
* Add input processing to TF Roberta
* Add input processing to TF T5
* Add input processing to TF TransfoXL
* Apply style
* Rebase on master
* Fix wrong model name
* Fix BART
* Apply style
* Put the deprecated warnings in the input processing function
* Remove the unused imports
* Raise an error when len(kwargs)>0
* test ModelOutput instead of TFBaseModelOutput
* Address Patrick's comments
* Address Patrick's comments
* Add boolean processing for the inputs
* Take into account the optional layers
* Add missing/unexpected weights in the other models
* Apply style
* rename parameters
* Apply style
* Remove useless
* Remove useless
* Remove useless
* Update num parameters
* Fix tests
* Address Patrick's comment
* Remove useless attribute
2020-12-08 09:14:09 -05:00
Sylvain Gugger
00aa9dbca2
Copyright (#8970)
...
* Add copyright everywhere missing
* Style
2020-12-07 18:36:34 -05:00
Julien Plu
dcd3046f98
Better booleans handling in the TF models (#8777)
...
* Apply on BERT and ALBERT
* Update TF Bart
* Add input processing to TF BART
* Add input processing for TF CTRL
* Add input processing to TF Distilbert
* Add input processing to TF DPR
* Add input processing to TF Electra
* Add deprecated arguments
* Add input processing to TF XLM
* Add input processing to TF Funnel
* Add input processing to TF GPT2
* Add input processing to TF Longformer
* Add input processing to TF Lxmert
* Apply style
* Add input processing to TF Mobilebert
* Add input processing to TF GPT
* Add input processing to TF Roberta
* Add input processing to TF T5
* Add input processing to TF TransfoXL
* Apply style
* Rebase on master
* Bug fix
* Retry to bugfix
* Retry bug fix
* Fix wrong model name
* Try another fix
* Fix BART
* Fix input processing
* Apply style
* Put the deprecated warnings in the input processing function
* Remove the unused imports
* Raise an error when len(kwargs)>0
* test ModelOutput instead of TFBaseModelOutput
* Bug fix
* Address Patrick's comments
* Address Patrick's comments
* Address Sylvain's comments
* Add boolean processing for the inputs
* Apply style
* Missing optional
* Fix missing some input proc
* Update the template
* Fix missing inputs
* Missing input
* Fix args parameter
* Trigger CI
* Trigger CI
* Trigger CI
* Address Patrick's and Sylvain's comments
* Replace warn by warning
* Trigger CI
* Fix XLNET
* Fix detection
2020-12-04 09:08:29 -05:00
Patrick von Platen
443f67e887
[PyTorch] Refactor Resize Token Embeddings (#8880)
...
* fix resize tokens
* correct mobile_bert
* move embedding fix into modeling_utils.py
* refactor
* fix lm head resize
* refactor
* break lines to make sylvain happy
* add new tests
* fix typo
* improve test
* skip bart-like for now
* check if base_model = get(...) is necessary
* clean files
* improve test
* fix tests
* revert style templates
* Update templates/adding_a_new_model/cookiecutter-template-{{cookiecutter.modelname}}/modeling_{{cookiecutter.lowercase_modelname}}.py
2020-12-02 19:19:50 +01:00
Patrick von Platen
a7d46a0609
Fix dpr<>bart config for RAG (#8808)
...
* correct dpr test and bert pos fault
* fix dpr bert config problem
* fix layoutlm
* add config to dpr as well
2020-11-27 16:26:45 +01:00
Kristian Holsheimer
f8eda599bd
[FlaxBert] Fix non-broadcastable attention mask for batched forward-passes (#8791)
...
* [FlaxBert] Fix non-broadcastable attention mask for batched forward-passes
* [FlaxRoberta] Fix non-broadcastable attention mask
* Use jax.numpy instead of ordinary numpy (otherwise not jit-able)
* Partially revert "Use jax.numpy ..."
* Add tests for batched forward passes
* Avoid unnecessary OOMs due to preallocation of GPU memory by XLA
* Auto-fix style
* Re-enable GPU memory preallocation but with mem fraction < 1/parallelism
2020-11-27 13:21:19 +01:00
Patrick von Platen
2a6fbe6a40
[XLNet] Fix mems behavior (#8567)
...
* fix mems in xlnet
* fix use_mems
* fix use_mem_len
* fix use mems
* clean docs
* fix tf typo
* make xlnet tf for generation work
* fix tf test
* refactor use cache
* add use cache for missing models
* correct use_cache in generate
* correct use cache in tf generate
* fix tf
* correct getattr typo
* make sylvain happy
* change in docs as well
* do not apply to cookie cutter statements
* fix tf test
* make pytorch model fully backward compatible
2020-11-25 16:54:59 -05:00
Julien Plu
29d4992453
New TF model inputs (#8602)
...
* Apply on BERT and ALBERT
* Update TF Bart
* Add input processing to TF BART
* Add input processing for TF CTRL
* Add input processing to TF Distilbert
* Add input processing to TF DPR
* Add input processing to TF Electra
* Add input processing for TF Flaubert
* Add deprecated arguments
* Add input processing to TF XLM
* remove unused imports
* Add input processing to TF Funnel
* Add input processing to TF GPT2
* Add input processing to TF Longformer
* Add input processing to TF Lxmert
* Apply style
* Add input processing to TF Mobilebert
* Add input processing to TF GPT
* Add input processing to TF Roberta
* Add input processing to TF T5
* Add input processing to TF TransfoXL
* Apply style
* Rebase on master
* Bug fix
* Retry to bugfix
* Retry bug fix
* Fix wrong model name
* Try another fix
* Fix BART
* Fix input processing
* Apply style
* Put the deprecated warnings in the input processing function
* Remove the unused imports
* Raise an error when len(kwargs)>0
* test ModelOutput instead of TFBaseModelOutput
* Bug fix
* Address Patrick's comments
* Address Patrick's comments
* Address Sylvain's comments
* Add the new inputs in new Longformer models
* Update the template with the new input processing
* Remove useless assert
* Apply style
* Trigger CI
2020-11-24 13:55:00 -05:00
zhiheng-huang
2c83b3c38d
Support various BERT relative position embeddings (2nd) (#8276)
...
* Support BERT relative position embeddings
* Fix typo in README.md
* Address review comment
* Fix failing tests
* [tiny] Fix style_doc.py check by adding an empty line to configuration_bert.py
* make fix copies
* fix configs of electra and albert and fix longformer
* remove copy statement from longformer
* fix albert
* fix electra
* Add bert variants forward tests for various position embeddings
* [tiny] Fix style for test_modeling_bert.py
* improve docstring
* [tiny] improve docstring and remove unnecessary dependency
* [tiny] Remove unused import
* re-add to ALBERT
* make embeddings work for ALBERT
* add test for albert
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-11-24 14:40:53 +01:00
Stas Bekman
e84786aaa6
consistent ignore keys + make private (#8737)
...
* consistent ignore keys + make private
* style
* - authorized_missing_keys => _keys_to_ignore_on_load_missing
- authorized_unexpected_keys => _keys_to_ignore_on_load_unexpected
* move public doc of private attributes to private comment
2020-11-23 12:33:13 -08:00
Sylvain Gugger
dd52804f5f
Remove deprecated (#8604)
...
* Remove old deprecated arguments
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
* Remove needless imports
* Fix tests
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
2020-11-17 15:11:29 -05:00
Sylvain Gugger
c89bdfbe72
Reorganize repo (#8580)
...
* Put models in subfolders
* Styling
* Fix imports in tests
* More fixes in test imports
* Sneaky hidden imports
* Fix imports in doc files
* More sneaky imports
* Finish fixing tests
* Fix examples
* Fix path for copies
* More fixes for examples
* Fix dummy files
* More fixes for example
* More model import fixes
* Is this why you're unhappy GitHub?
* Fix imports in convert command
2020-11-16 21:43:42 -05:00