Sylvain Gugger
f086652b16
Add option to log only once in multinode training ( #11819 )
...
* Add option to log only once in multinode training
* Use an alternate property
2021-05-25 08:03:43 -04:00
Sylvain Gugger
7eee950ac3
Re-styling in seq2seq attention ( #11613 )
2021-05-06 14:24:19 -04:00
Bhadresh Savani
1d30ec95c7
[Examples] Fixes inconsistency around eval vs val and predict vs test ( #11380 )
...
* added changes for uniformity
* modified files
* corrected typo
* fixed qa scripts
* fix typos
* fixed predict typo in qa no trainer
* fixed test file
* reverted trainer changes
* reverted trainer changes in custom examples
* updated readme
* added changes in deepspeed test
* added changes for predict and eval
2021-04-26 09:24:31 -07:00
Daniel Stancl
38a716cd41
TF BART models - Add cross_attentions to model output and fix cross-attention head masking ( #10699 )
...
* Add cross_attn_head_mask to BART
* Fix cross_attentions in TFBart-like models
* This commit enables returning of `cross_attentions`
for TFBart-like models
* It also fixes attention head masking in cross-attention module
* Update TF model templates
* Fix missing , in TF model templates
* Fix typo: congig -> config
2021-04-26 14:16:21 +02:00
Daniel Stancl
e3ff165aa5
Fix cross-attention head mask for Torch encoder-decoder models ( #10605 )
...
* Fix cross-attention head mask for Torch BART models
* Fix head masking for cross-attention module for the following
models: BART, Blenderbot, Blenderbot_small, M2M_100, Marian, MBart,
Pegasus
* Enable test_headmasking for M2M_100 model
* Fix cross_head_mask for FSMT, LED and T5
* This commit fixes `head_mask` for cross-attention modules
in the following models: FSMT, LED, T5
* It also contains some smaller changes in doc so that
it is perfectly clear that the shape of `cross_head_mask`
is the same as that of `decoder_head_mask`
* Update template
* Fix template for BartForCausalLM
* Fix cross_head_mask for Speech2Text models
* Fix cross_head_mask in templates
* Fix args order in BartForCausalLM template
* Fix doc in BART templates
* Make more explicit naming
* `cross_head_mask` -> `cross_attn_head_mask`
* `cross_layer_head_mask` -> `cross_attn_layer_head_mask`
* Fix doc
* make style quality
* Fix speech2text docstring
2021-04-23 18:58:06 +02:00
Sylvain Gugger
74712e22f3
Honor contributors to models ( #11329 )
...
* Honor contributors to models
* Fix typo
* Address review comments
* Add more authors
2021-04-21 09:47:27 -04:00
Sylvain Gugger
45fc8c7951
Make get_special_tokens_mask consider all tokens ( #11163 )
2021-04-09 11:57:44 -04:00
Stas Bekman
c9035e4537
fix: The 'warn' method is deprecated ( #11105 )
...
* The 'warn' method is deprecated
* fix test
2021-04-07 09:20:06 -04:00
Sylvain Gugger
acc3bd9d2a
Enforce string-formatting with f-strings ( #10980 )
...
* First third
* Styling and fix mistake
* Quality
* All the rest
* Treat %s and %d
* typo
* Missing )
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
2021-03-31 10:00:27 -04:00
Sylvain Gugger
700229f8a4
Fixes in the templates ( #10951 )
...
* Fixes in the templates
* Define in all cases
* Dimensionality -> Dimension
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr >
2021-03-29 17:36:13 -04:00
Bhadresh Savani
7ef40120a0
[Examples] Added predict stage and Updated Example Template ( #10868 )
...
* added predict stage
* added test keyword in exception message
* removed example specific saving predictions
* fixed f-string error
* removed extra line
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com >
2021-03-23 10:37:59 -07:00
Sylvain Gugger
bf1f43fbd7
Update the example template for a no Trainer option ( #10865 )
2021-03-23 10:02:39 -04:00
Sylvain Gugger
2295d783d5
Copy tokenizer files in each of their repo ( #10624 )
...
* Move tokenizer files in each repo
* Fix mBART50 tests
* Fix mBART tests
* Fix Marian tests
* Update templates
2021-03-10 11:26:23 -05:00
Bhadresh Savani
ac17f71159
added max_sample args and metrics changes ( #10602 )
2021-03-09 12:06:56 -05:00
Sylvain Gugger
7da995c00c
Fix embeddings for PyTorch 1.8 ( #10549 )
...
* Fix embeddings for PyTorch 1.8
* Try with PyTorch 1.8.0
* Fix embeddings init
* Fix copies
* Typo
* More typos
2021-03-05 16:18:48 -05:00
Patrick von Platen
2d2ed2cc18
[T5] Fix speed degradation bug t5 ( #10496 )
...
* fix speed degradation bug t5
* fix for all models
* fix code quality
2021-03-03 12:42:41 +03:00
mingruimingrui
894db6701e
Bugfix: Removal of padding_idx in BartLearnedPositionalEmbedding ( #10200 )
...
* Assumption of padding_idx <2 might not stand
* Use offset instead of 2
* Fix with black
* Change behavior to warning instead for backward compatibility.
* Fix with black
* Remove warning
* Make padding_idx non-required
* padding_idx fix for blenderbot
* padding_idx fix for blenderbot_small
* padding_idx fix for led
* padding_idx fix for mbart
* Remove extra whitespaces
* padding_idx fix for template
* Fix padding_idx passed to nn.Embedding mistake
* Fixed padding_idx passed to positional embedding in template
* Remove padding_idx from pytorch learned positional embeddings
* Remove accidentally added quotes
* Remove padding_idx from tf learned positional embeddings
* Remove zeroing of weights in __init__
Co-authored-by: Wang Ming Rui <mingrui.wang@C02CJTUYMD6M.local >
2021-02-25 14:33:13 +03:00
Julien Plu
83d803ba02
Making TF BART-like models XLA and AMP compliant ( #10191 )
...
* Update BART
* Update Blenderbot
* Update BlenderbotSmall
* Update Marian
* Update MBart
* Update MBart
* Update Pegasus
* Update template
* Fix Marian and Pegasus
* Apply style
* Default initializer
* Default initializer
* Default initializer
* Remove int32 casts
* Fix template
* Remove more cast
2021-02-17 17:48:56 +01:00
Julien Plu
31b0560ab4
Add AMP for Albert ( #10141 )
2021-02-15 17:18:33 +01:00
Julien Plu
570218878a
Fix TF template ( #10189 )
...
* Fix template
* Update Seq2Seq tests
2021-02-15 09:21:57 -05:00
Patrick von Platen
8e13b73593
Update README.md
2021-02-11 18:35:27 +03:00
Patrick von Platen
d6b4f48ecb
Update ADD_BIG_BIRD.md
2021-02-11 18:34:17 +03:00
Patrick von Platen
4cda2d73ef
Update ADD_BIG_BIRD.md
2021-02-09 19:58:35 +03:00
Julien Plu
b82fe7d258
Replace strided slice with tf.expand_dims ( #10078 )
...
* Replace tf.newaxis -> tf.expand_dims
* Fix tests
* Fix tests
* Use reshape when a tensor needs a double expand
* Fix GPT2
* Fix GPT2
2021-02-09 11:48:28 -05:00
Lysandre Debut
c9df1b1d53
Model templates ( #10072 )
2021-02-08 09:07:02 -05:00
Julien Plu
cdd8659231
Fix TF template ( #10069 )
...
* Fix template
* Fix template
2021-02-08 08:10:50 -05:00
Julien Plu
31563e056d
Restore TF embeddings and attention layers to their previous version ( #9890 )
...
* Refactor BERT
* Restore all the concerned models
* Remove print
* Update template
* Apply Sylvain's and Morgan's comments
* Fix cast
* Put the cast inside call
* Remove cond in ebds
* Fix funnel
* Restore previous dot product (attention_scores) computation
* Add ConvBERT and BART
* Make all the S2S models ONNX compliant
* Fix test
* Fix check copies
2021-02-08 14:36:30 +03:00
Lysandre Debut
ae37ceacbd
Fix typo ( #10064 )
2021-02-08 06:02:05 -05:00
Stas Bekman
8ea412a86f
[examples] make run scripts executable ( #10037 )
...
* make executable
* make executable
* same for the template
* cleanup
2021-02-05 15:51:18 -08:00
Patrick von Platen
89be094e29
[Templates] Add template "call-for-model" markdown and "call-for-big-bird" markdown ( #9921 )
...
* add big bird
* change teacher to mentor
* add proposal template
* adapt template
* delete old template
* correct some links
* finish template
* create big bird from template
* add big bird
* improve boxes
* finish boxes
* add pointers for BigBird
* finish big bird
* up
* up
* up
* up
* apply lysandres and sylvains suggestions
* delete bogus file
* correct markdown
* try different style
* try different style
* finalize
2021-02-05 15:47:54 +03:00
Lysandre Debut
e89c959af9
Fix model templates ( #9999 )
2021-02-04 07:47:26 -05:00
demSd
00031785a8
BartForCausalLM analogs to ProphetNetForCausalLM ( #9128 )
...
* initialize bart4causalLM
* create BartDecoderWrapper, setters/getters
* delete spaces
* forward and additional methods
* update cache function, loss function, remove ngram* params in data class.
* add bartcausallm, bartdecoder testing
* correct bart for causal lm
* remove at
* add mbart as well
* up
* fix typo
* up
* correct
* add pegasusforcausallm
* add blenderbotforcausallm
* add blenderbotsmallforcausallm
* add marianforcausallm
* add test for MarianForCausalLM
* add Pegasus test
* add BlenderbotSmall test
* add blenderbot test
* fix a fail
* fix an import fail
* a fix
* fix
* Update modeling_pegasus.py
* fix models
* fix inputs_embeds setting getter
* adapt tests
* correct repo utils check
* finish test improvement
* fix tf models as well
* make style
* make fix-copies
* fix copies
* run all tests
* last changes
* fix all tests
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com >
2021-02-04 11:56:12 +03:00
Patrick von Platen
0f4dc5d864
fix typo in naming ( #9944 )
2021-02-02 12:22:42 +03:00
Patrick von Platen
538b3b4607
[Tokenizer Utils Base] Make pad function more flexible ( #9928 )
...
* change tokenizer requirement
* split line
* Correct typo from list to str
* improve style
* make other function pretty as well
* add comment
* correct typo
* add new test
* pass tests for tok without padding token
* Apply suggestions from code review
2021-02-02 10:35:27 +03:00
Patrick von Platen
0e3be1ac8f
Add new model docs ( #9667 )
...
* add new model logic
* fix docs
* change structure
* improve add_new_model
* push new changes
* up
* up
* correct spelling
* improve docstring
* correct line length
* update readme
* correct links
* correct typos
* only add rst file for now
* Apply suggestions from code review 1
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com >
Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be >
* Apply suggestions from code review
Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be >
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com >
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com >
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com >
Co-authored-by: Stefan Schweter <stefan@schweter.it >
Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be >
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com >
Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com >
* finish adding all suggestions
* make style
* apply Niels feedback
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* apply sylvains suggestions
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com >
Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be >
Co-authored-by: Stefan Schweter <stefan@schweter.it >
Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com >
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2021-02-01 17:55:10 +03:00
Sylvain Gugger
d85691ac75
Doc title in the template ( #9910 )
2021-02-01 03:05:31 -05:00
Sylvain Gugger
b4e559cfa1
Deprecate model_path in Trainer.train ( #9854 )
2021-01-28 08:32:46 -05:00
Funtowicz Morgan
2ee9f9b69e
Fix computation of attention_probs when head_mask is provided. ( #9853 )
...
* Fix computation of attention_probs when head_mask is provided.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com >
* Apply changes to the template
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr >
2021-01-28 06:11:52 -05:00
Lysandre Debut
763ece2fea
Fix model templates ( #9842 )
2021-01-27 08:20:58 -05:00
Julien Plu
bd701ab1a0
Fix template ( #9840 )
2021-01-27 07:40:30 -05:00
Julien Plu
4adbdce5ee
Clean TF Bert ( #9788 )
...
* Start cleaning BERT
* Clean BERT and all models that depend on it
* Fix attribute name
* Apply style
* Apply Sylvain's comments
* Apply Lysandre's comments
* remove unused import
2021-01-27 11:28:11 +01:00
Julien Plu
a1720694a5
Remove a TF usage warning and rework the documentation ( #9756 )
...
* Rework documentation
* Update the template
* Trigger CI
* Restore the warning but with the TF logger
* Update convbert doc
2021-01-27 10:45:42 +01:00
Sylvain Gugger
f2fabedbab
Setup logging with a stdout handler ( #9816 )
2021-01-27 03:39:11 -05:00
Lysandre
897a24c869
Fix head_mask for model templates
2021-01-26 11:02:48 +01:00
Andrea Cappelli
10e5f28212
Improve pytorch examples for fp16 ( #9796 )
...
* Pad to 8x for fp16 multiple choice example (#9752 )
* Pad to 8x for fp16 squad trainer example (#9752 )
* Pad to 8x for fp16 ner example (#9752 )
* Pad to 8x for fp16 swag example (#9752 )
* Pad to 8x for fp16 qa beam search example (#9752 )
* Pad to 8x for fp16 qa example (#9752 )
* Pad to 8x for fp16 seq2seq example (#9752 )
* Pad to 8x for fp16 glue example (#9752 )
* Pad to 8x for fp16 new ner example (#9752 )
* update script template #9752
* Update examples/multiple-choice/run_swag.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* Update examples/question-answering/run_qa.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* Update examples/question-answering/run_qa_beam_search.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* improve code quality #9752
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2021-01-26 04:47:07 -05:00
Sylvain Gugger
caf4abf768
Auto-resume training from checkpoint ( #9776 )
...
* Auto-resume training from checkpoint
* Update examples/text-classification/run_glue.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
* Roll out to other examples
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
2021-01-25 12:03:51 -05:00
Julien Plu
a7dabfb3d1
Fix TF s2s models ( #9478 )
...
* Fix Seq2Seq models for serving
* Apply style
* Fix Longformer
* Fix mBart/Pegasus/Blenderbot
* Apply style
* Add a main intermediate layer
* Apply style
* Remove import
* Apply tf.function to Longformer
* Fix utils check_copy
* Update S2S template
* Fix BART + Blenderbot
* Fix BlenderbotSmall
* Fix BlenderbotSmall
* Fix BlenderbotSmall
* Fix MBart
* Fix Marian
* Fix Pegasus + template
* Apply style
* Fix common attributes test
* Forgot to fix the LED test
* Apply Patrick's comment on LED Decoder
2021-01-21 17:03:29 +01:00
Julien Plu
3f290e6c84
Fix mixed precision in TF models ( #9163 )
...
* Fix Gelu precision
* Fix gelu_fast
* Naming
* Fix usage and apply style
* add TF gelu approximate version
* add TF gelu approximate version
* add TF gelu approximate version
* Apply style
* Fix albert
* Remove the usage of the Activation layer
2021-01-21 07:00:11 -05:00
Julien Plu
7251a4736d
Fix template ( #9697 )
2021-01-20 09:04:53 -05:00
Julien Plu
14042d560f
New TF embeddings (cleaner and faster) ( #9418 )
...
* Create new embeddings + add to BERT
* Add Albert
* Add DistilBert
* Add Albert + Electra + Funnel
* Add Longformer + Lxmert
* Add last models
* Apply style
* Update the template
* Remove unused imports
* Rename attribute
* Import embeddings in their own model file
* Replace word_embeddings with weight
* fix naming
* Fix Albert
* Fix Albert
* Fix Longformer
* Fix Lxmert Mobilebert and MPNet
* Fix copy
* Fix template
* Update the get weights function
* Update src/transformers/modeling_tf_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* Update src/transformers/models/electra/modeling_tf_electra.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* address Sylvain's comments
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2021-01-20 12:08:12 +01:00