Julien Plu
3d72d47f09
Making TF MPNet model compliant with XLA ( #10260 )
...
* Fix XLA
* Rework cast
* Apply style
2021-02-19 06:56:41 -05:00
Julien Plu
fb56bf2584
Making TF MobileBert model compliant with AMP ( #10259 )
...
* Fix AMP
* Trigger CI
* Rework cast
2021-02-19 06:55:25 -05:00
Julien Plu
2fc6284f04
Making TF Lxmert model compliant with AMP ( #10257 )
...
* Fix AMP
* Rework cast
* Apply style
2021-02-19 06:54:14 -05:00
Stas Bekman
4eddc459a9
[trainer] implement support for full fp16 in evaluation/predict ( #10268 )
...
* implement --fp16_full_eval
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* style
* add test
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-18 17:02:35 -08:00
Stas Bekman
97e688bc22
[Trainer] memory tracker metrics ( #10225 )
...
* memory tracker metrics
* go back to eval for somewhat consistency
* handle no-gpu case
* deal with stackable eval calls
* restore callback order
* style
* simplify the API
* add test
* docs
* consistently use eval_ prefix
* improve docs
* Update src/transformers/trainer_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* rename method
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-18 09:27:32 -08:00
Tanmay Garg
d7f38c5d1d
Introduce warmup_ratio training argument ( #10229 )
...
Introduce warmup_ratio training argument in both
TrainingArguments and TFTrainingArguments classes (#6673)
2021-02-18 12:23:33 -05:00
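A minimal sketch of how a warmup ratio can be turned into a number of warmup steps, related to the `warmup_ratio` commit above. The helper name `resolve_warmup_steps` is hypothetical, and the rule that an explicit `warmup_steps` takes priority over the ratio is an assumption made for illustration, not something the commit message states.

```python
import math


def resolve_warmup_steps(
    num_training_steps: int,
    warmup_steps: int = 0,
    warmup_ratio: float = 0.0,
) -> int:
    """Hypothetical helper: derive the number of warmup steps.

    Assumption: an explicit, positive warmup_steps wins over warmup_ratio.
    """
    if not 0.0 <= warmup_ratio <= 1.0:
        raise ValueError("warmup_ratio must be in the range [0, 1]")
    if warmup_steps > 0:
        return warmup_steps
    # Round up so that a small non-zero ratio still yields at least one step.
    return math.ceil(num_training_steps * warmup_ratio)
```

Expressing warmup as a ratio keeps the schedule proportional when the dataset size or the number of epochs changes, which is the usual motivation for such an argument.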
Julien Plu
2acae50a0c
Reduce the time spent for the TF slow tests ( #10152 )
...
* rework savedmodel slow test
* Improve savedmodel tests
* Remove useless content
2021-02-18 15:52:57 +01:00
Julien Plu
14ed3b978e
Fix AMP ( #10216 )
2021-02-18 06:29:43 -05:00
Julien Plu
bdf1669e3f
Making TF GPT2 compliant with XLA and AMP ( #10230 )
...
* Fix XLA and AMP
* Fix AMP and XLA
* Apply style
* Apply Patrick's comment
2021-02-18 09:36:01 +01:00
Stas Bekman
dee876ceff
[trainer] refactor place_model_on_device logic, add deepspeed ( #10243 )
...
* refactor place_model_on_device logic, add deepspeed
* doc
* style
2021-02-17 15:52:36 -08:00
Julien Plu
7246785a67
Make TF CTRL compliant with XLA and AMP ( #10209 )
...
* Fix XLA and AMP
* Apply style
* Remove useless cast
2021-02-17 18:54:15 +01:00
Julien Plu
fdb2351ebb
Making TF XLM-like models XLA and AMP compliant ( #10211 )
...
* Fix Flaubert and XLM
* Remove useless cast
* Tiny fix
* Tiny fix
2021-02-17 18:02:48 +01:00
Julien Plu
83d803ba02
Making TF BART-like models XLA and AMP compliant ( #10191 )
...
* Update BART
* Update Blenderbot
* Update BlenderbotSmall
* Update Marian
* Update MBart
* Update MBart
* Update Pegasus
* Update template
* Fix Marian and Pegasus
* Apply style
* Default initializer
* Default initializer
* Default initializer
* Remove int32 casts
* Fix template
* Remove more cast
2021-02-17 17:48:56 +01:00
Daniel Stancl
8d79e5ca49
Fix head masking for TFT5 ( #9877 )
...
* Fix head_mask and decoder_head_mask in TFT5 models
* Enable test_headmasking for both the TFT5 tester
and the TFT5EncoderOnly tester
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2021-02-17 19:00:09 +03:00
Lysandre Debut
4b91965731
Factor out methods ( #10215 )
2021-02-17 09:53:43 -05:00
Stas Bekman
e94d63f6cb
[trainer] fix ignored columns logger ( #10219 )
...
* [trainer] fix ignored columns logger
This PR fixes a confusing log entry that says:
```
The following columns in the evaluation set don't have a corresponding argument in `T5ForConditionalGeneration.forward` and have been ignored: .
```
when everything is in order.
* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-16 13:35:39 -08:00
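The ignored-columns fix above can be illustrated with a short sketch: only produce the log message when at least one column is actually ignored. The function names `find_ignored_columns` and `ignored_columns_message` are hypothetical and this is not the Trainer's code; only the message wording is taken from the commit body.

```python
import inspect
from typing import Callable, List, Optional


def find_ignored_columns(forward_fn: Callable, column_names: List[str]) -> List[str]:
    """Return the columns that do not match any parameter of forward_fn."""
    accepted = set(inspect.signature(forward_fn).parameters)
    return [name for name in column_names if name not in accepted]


def ignored_columns_message(
    model_name: str,
    forward_fn: Callable,
    column_names: List[str],
    split: str = "evaluation",
) -> Optional[str]:
    """Build the log message, or return None when nothing was ignored."""
    ignored = find_ignored_columns(forward_fn, column_names)
    if not ignored:
        # Nothing was dropped, so there is nothing worth logging.
        return None
    return (
        f"The following columns in the {split} set don't have a corresponding "
        f"argument in `{model_name}.forward` and have been ignored: "
        f"{', '.join(ignored)}."
    )
```

Returning `None` for the empty case leaves the decision to log with the caller, which avoids the confusing message that ends in `: .`.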
Sylvain Gugger
7169d1ea7b
Store FLOS as floats to avoid overflow. ( #10213 )
2021-02-16 11:15:15 -05:00
Julien Plu
31b0560ab4
Add AMP for Albert ( #10141 )
2021-02-15 17:18:33 +01:00
Suraj Patil
6fc940ed09
Add mBART-50 ( #10154 )
...
* add tokenizer for mBART-50
* update tokenizers
* make src_lang and tgt_lang optional
* update tokenizer test
* add setter
* update docs
* update conversion script
* update docs
* update conversion script
* update tokenizer
* update test
* update docs
* doc
* address Sylvain's suggestions
* fix test
* fix formatting
* nits
2021-02-15 20:58:54 +05:30
Suraj Patil
2a5c990038
fix RagTokenizer ( #10167 )
2021-02-15 19:48:12 +05:30
Julien Plu
c8d3fa0dfd
Check TF ops for ONNX compliance ( #10025 )
...
* Add check-ops script
* Finish implementing check_tf_ops and start the test
* Make the test mandatory only for BERT
* Update tf_ops folder
* Remove useless classes
* Add the ONNX test for GPT2 and BART
* Add an onnxruntime slow test + better opset flexibility
* Fix test + apply style
* fix tests
* Switch min opset from 12 to 10
* Update src/transformers/file_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Fix GPT2
* Remove extra shape_list usage
* Fix GPT2
* Address Morgan's comments
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-02-15 07:55:10 -05:00
Nicolas Patry
900daec24e
Fixing NER pipeline for list inputs. ( #10184 )
...
Fixes #10168
2021-02-15 06:22:45 -05:00
Sylvain Gugger
587197dcd2
Fix datasets set_format ( #10178 )
2021-02-15 05:49:07 -05:00
Stas Bekman
8fae93ca19
[t5 tokenizer] add info logs ( #9897 )
...
* save fast tokenizer + add info logs
* fix tests
* remove the saving of fast tokenizer
2021-02-13 09:10:22 -05:00
Manuel Romero
698c9e2dbd
Fix typo in comment ( #10156 )
2021-02-13 08:26:25 -05:00
Manuel Romero
c969366870
Fix typo in comments ( #10157 )
2021-02-13 08:26:01 -05:00
Nicolas Patry
c9837a0d27
Conversion from slow to fast for BPE spm vocabs contained an error. ( #10120 )
...
* Conversion from slow to fast for BPE spm vocabs contained an error.
- There is currently only 1 test (tokenizers + slow) that uses the modified path,
and it's reformer, which does not contain any ids modification, so the
bug was silent until now.
- The real issue is that the vocab variable was overwritten by
SentencePieceExtractor, causing slow-specific vocab oddities to be
completely ignored
- The bug was reported here https://github.com/huggingface/transformers/issues/9518
- Ran the complete tokenization test suite with slow without error
(`RUN_SLOW=1 pytest -sv tests/test_tokenization_*`)
* Remove rebase error.
* Adding the fixture.
2021-02-13 08:24:53 -05:00
Lysandre Debut
dd3a7f9641
Revert propagation ( #10171 )
2021-02-13 08:19:56 -05:00
Julien Chaumond
eed31db948
[hf_api] delete deprecated methods and tests ( #10159 )
...
* [hf_api] delete deprecated methods and tests
cc @lhoestq
* Update test_hf_api.py
2021-02-12 15:35:06 -05:00
Mohamed Al Salti
1321356bdf
Fix typo in GPT2DoubleHeadsModel docs ( #10148 )
...
* Fix typo
* apply suggestion
Co-authored-by: Suraj Patil <surajp815@gmail.com>
2021-02-12 22:48:39 +05:30
Sylvain Gugger
31245775e5
Add SageMakerTrainer for model parallelism ( #10122 )
...
* Refactor things out of main train
* Store signature
* Add SageMakerTrainer
* Init + Copyright
* Address review comments
2021-02-11 18:44:18 -05:00
Stas Bekman
b54cb0bd82
[DeepSpeed in notebooks] Jupyter + Colab ( #10130 )
...
* init devices/setup explicitly
* docs + test
* simplify
* cleanup
* cleanup
* cleanup
* correct the required dist setup
* derive local_rank from env LOCAL_RANK
2021-02-11 14:02:05 -08:00
Patrick von Platen
495c157d6f
[Wav2Vec2] Improve Tokenizer & Model for batched inference ( #10117 )
...
* save intermediate
* finish batch the same as fairseq
* add normalization
* fix batched input
* add better comment
* Update src/transformers/models/wav2vec2/modeling_wav2vec2.py
* add nice docstring
* add tokenizer tests
* make all slow tests pass
* finish PR
* correct import
2021-02-11 15:40:54 +03:00
Stas Bekman
77b862847b
[DeepSpeed] restore memory for evaluation ( #10114 )
...
* free up memory at the end of train
* rework tests
* consistent formatting
* correction
2021-02-10 09:09:48 -08:00
Suraj Patil
c130e67dce
remove adjust_logits_during_generation method ( #10087 )
...
* add forced logits processors
* delete adjust_logits method
* add forced_eos_token_id argument in config
* add tests for forced logits processors
* update gen utils tests
* add forced option to tf generate
* remove adjust_logits method from tf models
* update adjust_logits for marian
* delete _force_token_id_to_be_generated method
* style
* import warnings
* pass max_length to _get_logits_processor
* set forced_eos_token_id to None
* set forced attributes in conf utils
* typo
* fix rag generate
* add forced_eos_token_id in rag config
* remove force_bos_token_to_be_generated from BartConfig
* remove _force_token_ids_generation from FSMT
* nit
* fix negative constant
* apply suggestions from code review
2021-02-10 22:39:09 +05:30
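A minimal sketch of the idea behind a forced EOS logits processor, as introduced by the commit above: once generation reaches the last allowed position, every token except the end-of-sequence token is masked out. It works on a plain list of scores for one sequence, the function name `force_eos_at_max_length` is hypothetical, and the exact masking rule is an assumption made for illustration rather than the library's implementation.

```python
import math
from typing import List


def force_eos_at_max_length(
    scores: List[float],
    cur_len: int,
    max_length: int,
    eos_token_id: int,
) -> List[float]:
    """Force the EOS token when the next token is the last allowed one.

    scores holds one score per vocabulary entry for a single sequence.
    """
    if cur_len == max_length - 1:
        # Mask every token except EOS so that EOS is the only possible choice.
        return [
            0.0 if token_id == eos_token_id else -math.inf
            for token_id in range(len(scores))
        ]
    # Before the final step the scores pass through unchanged.
    return list(scores)
```

Keeping this rule in a standalone processor, rather than in a per-model `adjust_logits_during_generation` method, lets it be switched on through a config value such as `forced_eos_token_id`.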
Julien Plu
22a32cf485
Fix TF LED/Longformer attentions computation ( #10007 )
...
* Fix test
* Remove commented test
* Fix name
* Apply style
* Fix check copies
* Remove prints
* Restore boolean
* Fix reshape
2021-02-10 10:58:37 -05:00
Shiva Zamani
85395e4901
Remove speed metrics from default compute objective ( #10107 )
2021-02-09 19:03:02 -05:00
Boris Dayma
7c7962ba89
doc: update W&B related doc ( #10086 )
...
* doc: update W&B related doc
* doc(wandb): mention report_to
* doc(wandb): commit suggestion
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* doc(wandb): fix typo
* doc(wandb): remove WANDB_DISABLED
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-09 14:47:52 -05:00
Suraj Patil
3e0c62b611
[RAG] fix generate ( #10094 )
...
* fix rag generate and tests
* put back adjust_logits_during_generation
* tests are okay
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-09 21:57:38 +03:00
Patrick von Platen
226973a9c5
fix import ( #10103 )
2021-02-09 21:43:41 +03:00
Julien Plu
b82fe7d258
Replace strided slice with tf.expand_dims ( #10078 )
...
* Replace tf.newaxis -> tf.expand_dims
* Fix tests
* Fix tests
* Use reshape when a tensor needs a double expand
* Fix GPT2
* Fix GPT2
2021-02-09 11:48:28 -05:00
Daniel Stancl
e7381c4596
Add head_mask and decoder_head_mask to TF LED ( #9988 )
...
* Add head masking to TF LED
* Add head_mask to Longformer + one doc piece to LED
* Fix integration tests
2021-02-09 11:45:18 -05:00
Sylvain Gugger
77c0ce8c0c
Fix some edge cases in report_to and add deprecation warnings ( #10100 )
2021-02-09 10:38:12 -05:00
Lysandre Debut
78f4a0e7e5
Logging propagation ( #10092 )
...
* Enable propagation by default
* Document enable/disable default handler
2021-02-09 10:27:49 -05:00
Julien Plu
c6d5e56595
Fix naming ( #10095 )
2021-02-09 06:10:31 -05:00
abhishek thakur
4ed763779e
Fix example in Wav2Vec2 documentation ( #10096 )
...
* Fix example in Wav2Vec2 documentation
* fix style
2021-02-09 06:07:56 -05:00
Patrick von Platen
b972125ced
Deprecate Wav2Vec2ForMaskedLM and add Wav2Vec2ForCTC ( #10089 )
...
* add wav2vec2CTC and deprecate for maskedlm
* remove from docs
2021-02-09 03:49:02 -05:00
demSd
84acf0c7bb
remove token_type_ids from TokenizerBertGeneration output ( #10070 )
2021-02-08 13:05:32 -05:00
Stas Bekman
322037e842
[trainer] deepspeed bug fixes and tests ( #10039 )
...
* deepspeed bug fixes and tests
* manual wrap?
2021-02-08 09:44:02 -08:00
Anthony MOI
f285e4c3ad
Update tokenizers requirement ( #10077 )
2021-02-08 12:27:26 -05:00