949 Commits

Author SHA1 Message Date
Suraj Patil
a442f87adc add LongformerTokenizerFast in AutoTokenizer (#6463) 2020-08-13 12:06:43 -04:00
Lysandre Debut
f7cbc13db7 Test model outputs equivalence (#6445)
* Test model outputs equivalence

* Fix failing tests

* From dict to kwargs

* DistilBERT

* Addressing @sgugger and @patrickvonplaten's comments
2020-08-13 11:59:35 -04:00
Prajjwal Bhargava
54c687e97c typo fix (#6462) 2020-08-13 09:36:48 -04:00
Zhu Baohe
9d94aecd51 Fix docs and bad word tokens generation_utils.py (#6387)
* fix

* fix2

* fix3
2020-08-13 13:12:16 +02:00
Joe Davison
bc820476a5 add targets arg to fill-mask pipeline (#6239)
* add targets arg to fill-mask pipeline

* add tests and more error handling

* quality

* update docstring
2020-08-12 12:48:29 -04:00
Patrick von Platen
0735def8e1 [EncoderDecoder] Add encoder-decoder for roberta/ vanilla longformer (#6411)
* add encoder-decoder for roberta

* fix headmask

* apply Sylvains suggestions

* fix typo

* Apply suggestions from code review
2020-08-12 18:23:30 +02:00
Sylvain Gugger
d2370e1bd8 Adding PaddingDataCollator (#6442)
* Data collator with padding

* Add type annotation

* Support tensors as well

* Add comment

* Fix for labels wrong shape

* Remove changes rendered unnecessary
2020-08-12 11:32:27 -04:00
Sylvain Gugger
96c3329f19 Fix #6428 (#6437) 2020-08-12 08:47:30 -04:00
Sylvain Gugger
34fabe1697 Move prediction_loss_only to TrainingArguments (#6426) 2020-08-12 08:03:45 -04:00
Sylvain Gugger
e9c3031463 Fixes to make life easier with the nlp library (#6423)
* allow using tokenizer.pad as a collate_fn in pytorch

* allow using tokenizer.pad as a collate_fn in pytorch

* Add documentation and tests

* Make attention mask the right shape

* Better test

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
2020-08-12 08:00:56 -04:00
Jared T Nielsen
ac5bcf236e Fix FFN dropout in TFAlbertLayer, and split dropout in TFAlbertAttent… (#4323)
* Fix FFN dropout in TFAlbertLayer, and split dropout in TFAlbertAttention into two separate dropout layers.

* Same dropout fixes for PyTorch.
2020-08-12 07:52:42 -04:00
Stas Bekman
ece0903e11 lr_schedulers: add get_polynomial_decay_schedule_with_warmup (#6361)
* [wip] add get_polynomial_decay_schedule_with_warmup

* style

* add assert

* change lr_end to a much smaller default number

* check for exact equality

* [model_cards] electra-base-turkish-cased-ner (#6350)

* for electra-base-turkish-cased-ner

* Add metadata

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

* Temporarily de-activate TPU CI

* Update modeling_tf_utils.py (#6372)

fix typo: ckeckpoint->checkpoint

* the test now works again (#6371)

* correct pl link in readme (#6364)

* refactor almost identical tests (#6339)

* refactor almost identical tests

* important to add a clear assert error message

* make the assert error even more descriptive than the original bt

* Small docfile fixes (#6328)

* Patch models (#6326)

* TFAlbertFor{TokenClassification, MultipleChoice}

* Patch models

* BERT and TF BERT info


* Update check_repo

* Ci GitHub caching (#6382)

* Cache Github Actions CI

* Remove useless file

* Colab button (#6389)

* Add colab button

* Add colab link for tutorials

* Fix links for open in colab (#6391)

* Update src/transformers/optimization.py

consistently use lr_end=1e-7 default

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* remove dup (leftover from merge)

* convert the test into the new refactored format

* stick to using the current_step as is, without ++

Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com>
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Alexander Measure <ameasure@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-08-11 17:56:41 -04:00
Sam Shleifer
be1520d3a3 rename prepare_translation_batch -> prepare_seq2seq_batch (#6103) 2020-08-11 15:57:07 -04:00
Sam Shleifer
66fa8ceaea PegasusForConditionalGeneration (torch version) (#6340)
Co-authored-by: Jingqing Zhang <jingqing.zhang15@imperial.ac.uk>
2020-08-11 14:31:23 -04:00
guillaume-be
404782912a [Performance improvement] "Bad tokens ids" optimization (#6064)
* Optimized banned token masking

* Avoid duplicate EOS masking if in bad_words_id

* Updated mask generation to handle empty banned token list

* Addition of unit tests for the updated bad_words_ids masking

* Updated timeout handling in `test_postprocess_next_token_scores_large_bad_words_list` unit test

* Updated timeout handling in `test_postprocess_next_token_scores_large_bad_words_list` unit test (timeout does not work on Windows)

* Moving Marian import to the test context to allow TF only environments to run

* Moving imports to torch_available test

* Updated operations device and test

* Updated operations device and test

* Added docstring and comment for in-place scores modification

* Moving test to own test_generation_utils, use of lighter models for testing

* removed unneeded imports in test_modeling_common

* revert formatting change for ModelTesterMixin

* Updated caching, simplified eos token id test, removed unnecessary @require_torch

* formatting compliance
2020-08-11 05:56:40 -04:00
David LaPalomento
87e124c245 Warn if debug requested without TPU fixes (#6308) (#6390)
* Warn if debug requested without TPU fixes (#6308)
Check whether a PyTorch compatible TPU is available before attempting to print TPU metrics after training has completed. This way, users who apply `--debug` without reading the documentation aren't surprised by a stacktrace.

* Style

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-08-11 05:31:26 -04:00
Junyuan Zheng
cdf1f7edb2 Fix tokenizer saving and loading error (#6026)
* fix tokenizer saving and loading bugs when adding AddedToken to additional special tokens

* Add tokenizer test

* Style

* Style 2

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-08-11 04:49:16 -04:00
Stas Bekman
83984a61c6 testing utils: capturing std streams context manager (#6231)
* testing utils: capturing std streams context manager

* style

* missing import

* add the origin of this code
2020-08-11 03:56:47 -04:00
Pradhy729
b25cec13c5 Feed forward chunking (#6024)
* Chunked feed forward for Bert

This is an initial implementation to test applying feed forward chunking for BERT.
Will need additional modifications based on output and benchmark results.

* Black and cleanup

* Feed forward chunking in BertLayer class.

* Isort

* add chunking for all models

* fix docs

* Fix typo

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2020-08-11 03:12:45 -04:00
Patrick von Platen
00bb0b25ed TF Longformer (#5764)
* improve names and tests longformer

* more and better tests for longformer

* add first tf test

* finalize tf basic op functions

* fix merge

* tf shape test passes

* narrow down discrepancies

* make longformer local attn tf work

* correct tf longformer

* add first global attn function

* add more global longformer func

* advance tf longformer

* finish global attn

* upload big model

* finish all tests

* correct false any statement

* fix common tests

* make all tests pass except keras save load

* fix some tests

* fix torch test import

* finish tests

* fix test

* fix torch tf tests

* add docs

* finish docs

* Update src/transformers/modeling_longformer.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_tf_longformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply Lysandres suggestions

* reverse to assert statement because function will fail otherwise

* applying sylvains recommendations

* Update src/transformers/modeling_longformer.py

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

* Update src/transformers/modeling_tf_longformer.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-08-10 23:25:06 +02:00
Patrick von Platen
3425936643 [EncoderDecoderModel] add a add_cross_attention boolean to config (#6377)
* correct encoder decoder model

* Apply suggestions from code review

* apply sylvains suggestions
2020-08-10 19:46:48 +02:00
Lysandre Debut
b99098abc7 Patch models (#6326)
* TFAlbertFor{TokenClassification, MultipleChoice}

* Patch models

* BERT and TF BERT info


* Update check_repo
2020-08-10 10:39:17 -04:00
Alexander Measure
3a556b0fb7 Update modeling_tf_utils.py (#6372)
fix typo: ckeckpoint->checkpoint
2020-08-10 02:55:11 -04:00
Patrick von Platen
1aec991643 [GPT2] Correct typo in docs (#6352) 2020-08-08 20:37:29 +02:00
Julien Plu
0e36e51515 Fix the tests for Electra (#6284)
* Fix the tests for Electra

* Apply style
2020-08-07 09:30:57 -04:00
Sylvain Gugger
6ba540b747 Add a script to check all models are tested and documented (#6298)
* Add a script to check all models are tested and documented

* Apply suggestions from code review

Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>

* Address comments

Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
2020-08-07 09:18:37 -04:00
idoh
3be2d04884 fix consistency CrossEntropyLoss in modeling_bart (#6265) 2020-08-07 17:44:28 +08:00
Lysandre Debut
0d9328f2ef Patch GPU failures (#6281)
* Pin to 1.5.0

* Patch XLM GPU test
2020-08-07 02:58:15 -04:00
Patrick von Platen
118ecfd427 fix for pytorch < 1.6 (#6300) 2020-08-06 21:14:46 +02:00
Sam Shleifer
2804fff839 [s2s]Use prepare_translation_batch for Marian finetuning (#6293)
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-08-06 14:58:38 -04:00
Teven
2f2aa0c89c added n_inner argument to gpt2 config (#6296) 2020-08-06 17:47:32 +02:00
Doug Blank
b923871bb7 Adds comet_ml to the list of auto-experiment loggers (#6176)
* Support for Comet.ml

* Need to import comet first

* Log this model, not the one in the backprop step

* Log args as hyperparameters; use framework to allow fine control

* Log hyperparameters with context

* Apply black formatting

* isort fix integrations

* isort fix __init__

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/trainer_tf.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Address review comments

* Style + Quality, remove Tensorboard import test

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-08-06 11:31:30 -04:00
Philip May
d5bc32ce92 Add strip_accents to basic BertTokenizer. (#6280)
* Add strip_accents to basic tokenizer

* Add tests for strip_accents.

* fix style with black

* Fix strip_accents test

* empty commit to trigger CI

* Improved strip_accents check

* Add code quality with is not False
2020-08-06 18:52:28 +08:00
Sylvain Gugger
c67d1a0259 Tf model outputs (#6247)
* TF outputs and test on BERT

* Albert to DistilBert

* All remaining TF models except T5

* Documentation

* One file forgotten

* Add new models and fix issues

* Quality improvements

* Add T5

* A bit of cleanup

* Fix for slow tests

* Style
2020-08-05 11:34:39 -04:00
Teven
bd0eab351a Trainer + wandb quality of life logging tweaks (#6241)
* added `name` argument for wandb logging, also logging model config with trainer arguments

* Update src/transformers/training_args.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* added tf, post-review changes

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-08-05 09:05:52 -04:00
Julien Plu
33966811bd Add SequenceClassification and MultipleChoice TF models to Electra (#6227)
* Add SequenceClassification and MultipleChoice TF models to Electra

* Apply style

* Add summary_proj_to_labels to Electra config

* Finally mirroring the PT version of these models

* Apply style

* Fix Electra test
2020-08-05 09:04:27 -04:00
Zhu Baohe
d89acd07cc fix (#6257) 2020-08-05 07:37:57 -04:00
Ninnart Fuengfusin
24c5a6e351 Update optimization.py (#6261) 2020-08-05 07:34:57 -04:00
Lilian Bordeau
ed6b8f3128 Update to match renamed attributes in fairseq master (#5972)
* Update to match renamed attributes in fairseq master

RobertaModel no longer has model.encoder and args.num_classes attributes as of 5/28/20.

* Quality

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-08-05 07:23:55 -04:00
Joe Davison
972535ea74 fix zero shot pipeline docs (#6245) 2020-08-04 16:37:49 -04:00
Patrick von Platen
6c9ba1d8fc [Reformer] Make random seed generator available on random seed and not on model device (#6244)
* improve if else statement random seeds

* Apply suggestions from code review

* Update src/transformers/modeling_reformer.py
2020-08-04 13:22:43 -04:00
Sam Shleifer
d5b0a0e235 mBART Conversion script (#6230) 2020-08-04 09:53:51 -04:00
Stas Bekman
268bf34630 typo (#6225) 2020-08-04 09:31:49 -04:00
Andrés Felipe Cruz
7ea9b2db37 Encoder decoder config docs (#6195)
* Adding docs for how to load encoder_decoder pretrained model with individual config objects

* Adding docs for loading encoder_decoder config from pretrained folder

* Fixing W293 blank line contains whitespace

* Update src/transformers/modeling_encoder_decoder.py

* Update src/transformers/modeling_encoder_decoder.py

* Update src/transformers/modeling_encoder_decoder.py

* Apply suggestions from code review

model file should only show examples for how to load save model

* Update src/transformers/configuration_encoder_decoder.py

* Update src/transformers/configuration_encoder_decoder.py

* fix space

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-08-04 09:23:28 +02:00
Gong Linyuan
b390a5672a Make the order of additional special tokens deterministic (#5704)
* Make the order of additional special tokens deterministic regardless of hash seeds

* Fix
2020-08-04 02:38:30 -04:00
Kevin Canwen Xu
3c289fb38c Remove outdated BERT tips (#6217)
* Remove out-dated BERT tips

* Update modeling_outputs.py

* Update bert.rst

* Update bert.rst
2020-08-04 01:17:56 +08:00
Sylvain Gugger
e4920c92d6 Doc pipelines (#6175)
* Init work on pipelines doc

* Work in progress

* Work in progress

* Doc pipelines

* Rm unwanted default

* Apply suggestions from code review

Lysandre comments

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-08-03 11:44:46 -04:00
Maurice Gonzenbach
06f1692b02 Fix _shift_right function in TFT5PreTrainedModel (#6214) 2020-08-03 16:21:23 +02:00
Suraj Patil
0b41867357 fix labels (#6213) 2020-08-03 10:19:35 -04:00
Jay Mody
cedc547e7e Adds train_batch_size, eval_batch_size, and n_gpu to to_sanitized_dict output for logging. (#5331)
* Adds train_batch_size, eval_batch_size, and n_gpu to to_sanitized_dict() output

* Update wandb config logging to use to_sanitized_dict

* removed n_gpu from sanitized dict

* fix quality check errors
2020-08-03 09:00:39 -04:00