Commit Graph

342 Commits

Author SHA1 Message Date
Suraj Patil
ef2dcdccaa ElectraForQuestionAnswering (#4913)
* ElectraForQuestionAnswering

* udate __init__

* add test for electra qa model

* add ElectraForQuestionAnswering in auto models

* add ElectraForQuestionAnswering in all_model_classes

* fix outputs, input_ids defaults to None

* add ElectraForQuestionAnswering in docs

* remove commented line
2020-06-10 15:17:52 -04:00
Amil Khare
5d63ca6c38 [ctrl] fix pruning of MultiHeadAttention (#4904) 2020-06-10 14:06:55 -04:00
Sylvain Gugger
4e10acb3e5 Add more models to common tests (#4910) 2020-06-10 13:19:53 -04:00
Sylvain Gugger
ac99217e92 Fix the CI (#4903)
* Fix CI
2020-06-10 09:26:06 -04:00
Sylvain Gugger
0a375f5abd Deal with multiple choice in common tests (#4886)
* Deal with multiple choice in common tests
2020-06-10 08:10:20 -04:00
Bharat Raghunathan
6e603cb789 [All models] Extend config.output_attentions with output_attentions function arguments (#4538)
* DOC: Replace instances of ``config.output_attentions`` with function argument ``output_attentions``

* DOC: Apply Black Formatting

* Fix errors where output_attentions was undefined

* Remove output_attentions in classes per review

* Fix regressions on tests having `output_attention`

* Fix further regressions in tests relating to `output_attentions`

Ensure proper propagation of `output_attentions` as a function parameter
to all model subclasses

* Fix more regressions in `test_output_attentions`

* Fix issues with BertEncoder

* Rename related variables to `output_attentions`

* fix pytorch tests

* fix bert and gpt2 tf

* Fix most TF tests for `test_output_attentions`

* Fix linter errors and more TF tests

* fix conflicts

* DOC: Apply Black Formatting

* Fix errors where output_attentions was undefined

* Remove output_attentions in classes per review

* Fix regressions on tests having `output_attention`

* fix conflicts

* fix conflicts

* fix conflicts

* fix conflicts

* fix pytorch tests

* fix conflicts

* fix conflicts

* Fix linter errors and more TF tests

* fix tf tests

* make style

* fix isort

* improve output_attentions

* improve tensorflow

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-06-09 23:39:06 +02:00
Patrick von Platen
2cfb947f59 [Benchmark] add tpu and torchscipt for benchmark (#4850)
* add tpu and torchscipt for benchmark

* fix name in tests

* "fix email"

* make style

* better log message for tpu

* add more print and info for tpu

* allow possibility to print tpu metrics

* correct cpu usage

* fix test for non-install

* remove bugus file

* include psutil in testing

* run a couple of times before tracing in torchscript

* do not allow tpu memory tracing for now

* make style

* add torchscript to env

* better name for torch tpu

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2020-06-09 23:12:43 +02:00
Patrick von Platen
c0554776de fix PR (#4810) 2020-06-08 15:31:12 +02:00
Sam Shleifer
c58e6c129a [marian tests ] pass device to pipeline (#4815) 2020-06-06 00:52:17 -04:00
Sam Shleifer
4ab7424597 [cleanup/marian] pipelines test and new kwarg (#4812) 2020-06-05 18:45:19 -04:00
Patrick von Platen
8cca875569 [EncoderDecoderConfig] automatically set decoder config to decoder (#4809)
* automatically set decoder config to decoder

* add more tests
2020-06-05 23:16:37 +02:00
Sylvain Gugger
f1fe18465d Use labels to remove deprecation warnings (#4807) 2020-06-05 16:41:46 -04:00
Sylvain Gugger
4dd5cf2207 Fix argument label (#4792)
* Fix argument label

* Fix test
2020-06-05 15:20:29 -04:00
Julien Plu
f9414f7553 Tensorflow improvements (#4530)
* Better None gradients handling

* Apply Style

* Apply Style

* Create a loss class per task to compute its respective loss

* Add loss classes to the ALBERT TF models

* Add loss classes to the BERT TF models

* Add question answering and multiple choice to TF Camembert

* Remove prints

* Add multiple choice model to TF DistilBERT + loss computation

* Add question answering model to TF Electra + loss computation

* Add token classification, question answering and multiple choice models to TF Flaubert

* Add multiple choice model to TF Roberta + loss computation

* Add multiple choice model to TF XLM + loss computation

* Add multiple choice and question answering models to TF XLM-Roberta

* Add multiple choice model to TF XLNet + loss computation

* Remove unused parameters

* Add task loss classes

* Reorder TF imports + add new model classes

* Add new model classes

* Bugfix in TF T5 model

* Bugfix for TF T5 tests

* Bugfix in TF T5 model

* Fix TF T5 model tests

* Fix T5 tests + some renaming

* Fix inheritance issue in the AutoX tests

* Add tests for TF Flaubert and TF XLM Roberta

* Add tests for TF Flaubert and TF XLM Roberta

* Remove unused piece of code in the TF trainer

* bugfix and remove unused code

* Bugfix for TF 2.2

* Apply Style

* Divide TFSequenceClassificationAndMultipleChoiceLoss into their two respective name

* Apply style

* Mirror the PT Trainer in the TF one: fp16, optimizers and tb_writer as class parameter and better dataset handling

* Fix TF optimizations tests and apply style

* Remove useless parameter

* Bugfix and apply style

* Fix TF Trainer prediction

* Now the TF models return the loss such as their PyTorch couterparts

* Apply Style

* Ignore some tests output

* Take into account the SQuAD cls_index, p_mask and is_impossible parameters for the QuestionAnswering task models.

* Fix names for SQuAD data

* Apply Style

* Fix conflicts with 2.11 release

* Fix conflicts with 2.11

* Fix wrongname

* Add better documentation on the new create_optimizer function

* Fix isort

* logging_dir: use same default as PyTorch

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-06-04 19:45:53 -04:00
Funtowicz Morgan
5bf9afbf35 Introduce a new tensor type for return_tensors on tokenizer for NumPy (#4585)
* Refactor tensor creation in tokenizers.

* Make sure to convert string to TensorType

* Refactor convert_to_tensors_

* Introduce numpy tensor creation

* Format

* Add unittest for TensorType creation from str

* sorting imports

* Added unittests for numpy tensor conversion.

* Do not use in-place version for squeeze as numpy doesn't provide such feature.

* Added extra parameter prepend_batch_axis: bool on prepare_for_model.

* Ensure test_np_encode_plus_sent_to_model is not executed if encoder/decoder model.

* style.

* numpy tests require_torch for now while flax not merged.

* Hopefully will make flake8 happy.

* One more time 🎶
2020-06-04 06:57:01 +02:00
Sylvain Gugger
1b5820a565 Unify label args (#4722)
* Deprecate masked_lm_labels argument

* Apply to all models

* Better error message
2020-06-03 09:36:26 -04:00
Patrick von Platen
9ca485734a [Reformer] Improved memory if input is shorter than chunk length (#4720)
* improve handling of short inputs for reformer

* correct typo in assert statement

* fix other tests
2020-06-02 23:08:39 +02:00
Sam Shleifer
70f7423436 TFRobertaModelIntegrationTest requires tf (#4726) 2020-06-02 12:59:00 -04:00
Julien Chaumond
b42586ea56 Fix CI after killing archive maps (#4724)
Some checks failed
GitHub-hosted runner / check_code_quality (push) Has been cancelled
* 🐛 Fix model ids for BART and Flaubert
2020-06-02 10:21:09 -04:00
Julien Chaumond
d4c2cb402d Kill model archive maps (#4636)
* Kill model archive maps

* Fixup

* Also kill model_archive_map for MaskedBertPreTrainedModel

* Unhook config_archive_map

* Tokenizers: align with model id changes

* make style && make quality

* Fix CI
2020-06-02 09:39:33 -04:00
Rens
ec62b7d953 Fix onnx export input names order (#4641)
* pass on tokenizer to pipeline

* order input names when convert to onnx

* update style

* remove unused imports

* make ordered inputs list needs to be mutable

* add test custom bert model

* remove unused imports
2020-06-01 16:12:48 +02:00
Patrick von Platen
0866669e75 [EncoderDecoder] Fix initialization and save/load bug (#4680)
* fix bug

* add more tests
2020-05-30 01:25:19 +02:00
Patrick von Platen
56ee2560be [Longformer] Better handling of global attention mask vs local attention mask (#4672)
* better api

* improve automatic setting of global attention mask

* fix longformer bug

* fix global attention mask in test

* fix global attn mask flatten

* fix slow tests

* update docstring

* update docs and make more robust

* improve attention mask
2020-05-29 17:58:42 +02:00
Patrick von Platen
9c17256447 [Longformer] Multiple choice for longformer (#4645)
* add multiple choice for longformer

* add models to docs

* adapt docstring

* add test to longformer

* add longformer for mc in init and modeling auto

* fix tests
2020-05-29 13:46:08 +02:00
Anthony MOI
5e737018e1 Fix add_special_tokens on fast tokenizers (#4531) 2020-05-28 10:54:45 -04:00
Suraj Patil
e444648a30 LongformerForTokenClassification (#4638) 2020-05-28 12:48:18 +02:00
Patrick von Platen
96f57c9ccb [Benchmark] Memory benchmark utils (#4198)
* improve memory benchmarking

* correct typo

* fix current memory

* check torch memory allocated

* better pytorch function

* add total cached gpu memory

* add total gpu required

* improve torch gpu usage

* update memory usage

* finalize memory tracing

* save intermediate benchmark class

* fix conflict

* improve benchmark

* improve benchmark

* finalize

* make style

* improve benchmarking

* correct typo

* make train function more flexible

* fix csv save

* better repr of bytes

* better print

* fix __repr__ bug

* finish plot script

* rename plot file

* delete csv and small improvements

* fix in plot

* fix in plot

* correct usage of timeit

* remove redundant line

* remove redundant line

* fix bug

* add hf parser tests

* add versioning and platform info

* make style

* add gpu information

* ensure backward compatibility

* finish adding all tests

* Update src/transformers/benchmark/benchmark_args.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/benchmark/benchmark_args_utils.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* delete csv files

* fix isort ordering

* add out of memory handling

* add better train memory handling

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-05-27 23:22:16 +02:00
Suraj Patil
ec4cdfdd05 LongformerForSequenceClassification (#4580)
* LongformerForSequenceClassification

* better naming x=>hidden_states, fix typo in doc

* Update src/transformers/modeling_longformer.py

* Update src/transformers/modeling_longformer.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-05-27 22:30:00 +02:00
Sam Shleifer
07797c4da4 [testing] LanguageModelGenerationTests require_tf or require_torch (#4616) 2020-05-27 09:10:26 -04:00
Sam Shleifer
b86e42e0ac [ci] fix 3 remaining slow GPU failures (#4584) 2020-05-25 19:20:50 -04:00
Suraj Patil
03d8527de0 Longformer for question answering (#4500)
* added LongformerForQuestionAnswering

* add LongformerForQuestionAnswering

* fix import for LongformerForMaskedLM

* add LongformerForQuestionAnswering

* hardcoded sep_token_id

* compute attention_mask if not provided

* combine global_attention_mask with attention_mask when provided

* update example in  docstring

* add assert error messages, better attention combine

* add test for longformerForQuestionAnswering

* typo

* cast gloabl_attention_mask to long

* make style

* Update src/transformers/configuration_longformer.py

* Update src/transformers/configuration_longformer.py

* fix the code quality

* Merge branch 'longformer-for-question-answering' of https://github.com/patil-suraj/transformers into longformer-for-question-answering

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-05-25 18:43:36 +02:00
Anthony MOI
35df911485 Fix convert_token_type_ids_from_sequences for fast tokenizers (#4503) 2020-05-22 12:45:10 -04:00
Frankie Liuzzi
bd6e301832 added functionality for electra classification head (#4257)
* added functionality for electra classification head

* unneeded dropout

* Test ELECTRA for sequence classification

* Style

Co-authored-by: Frankie <frankie@frase.io>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-05-22 09:48:21 -04:00
Zhangyx
49296533ca Adds predict stage for glue tasks, and generate result files which can be submitted to gluebenchmark.com (#4463)
* Adds predict stage for glue tasks, and generate result files which could be submitted to gluebenchmark.com website.

* Use Split enum + always output the label name

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-05-21 09:17:44 -04:00
Julien Chaumond
865d4d595e [ci] Close #4481 2020-05-20 18:27:42 -04:00
Julien Chaumond
a3af8e86cb Update test_trainer_distributed.py 2020-05-20 18:26:51 -04:00
Lysandre Debut
14cb5b35fa Fix slow gpu tests lysandre (#4487)
* There is one missing key in BERT

* Correct device for CamemBERT model

* RoBERTa tokenization adding prefix space

* Style
2020-05-20 11:59:45 -04:00
Sam Shleifer
efbc1c5a9d [MarianTokenizer] implement save_vocabulary and other common methods (#4389) 2020-05-19 19:45:49 -04:00
Sam Shleifer
956c4c4eb4 [gpu slow tests] fix mbart-large-enro gpu tests (#4472) 2020-05-19 19:45:31 -04:00
Patrick von Platen
aa925a52fa [Tests, GPU, SLOW] fix a bunch of GPU hardcoded tests in Pytorch (#4468)
* fix gpu slow tests in pytorch

* change model to device syntax
2020-05-19 21:35:04 +02:00
Sam Shleifer
07dd7c2fd8 [cleanup] test_tokenization_common.py (#4390) 2020-05-19 10:46:55 -04:00
Iz Beltagy
8f1d047148 Longformer (#4352)
* first commit

* bug fixes

* better examples

* undo padding

* remove wrong VOCAB_FILES_NAMES

* License

* make style

* make isort happy

* unit tests

* integration test

* make `black` happy by undoing `isort` changes!!

* lint

* no need for the padding value

* batch_size not bsz

* remove unused type casting

* seqlen not seq_len

* staticmethod

* `bert` selfattention instead of `n2`

* uint8 instead of bool + lints

* pad inputs_embeds using embeddings not a constant

* black

* unit test with padding

* fix unit tests

* remove redundant unit test

* upload model weights

* resolve todo

* simpler _mask_invalid_locations without lru_cache + backward compatible masked_fill_

* increase unittest coverage
2020-05-19 16:04:43 +02:00
Julien Chaumond
5e7fe8b585 Distributed eval: SequentialDistributedSampler + gather all results (#4243)
* Distributed eval: SequentialDistributedSampler + gather all results

* For consistency only write to disk from world_master

Close https://github.com/huggingface/transformers/issues/4272

* Working distributed eval

* Hook into scripts

* Fix #3721 again

* TPU.mesh_reduce: stay in tensor space

Thanks @jysohn23

* Just a small comment

* whitespace

* torch.hub: pip install packaging

* Add test scenarii
2020-05-18 22:02:39 -04:00
Julien Chaumond
4c06893610 Fix nn.DataParallel compatibility in PyTorch 1.5 (#4300)
* Test case for #3936

* multigpu tests pass on pytorch 1.4.0

* Fixup

* multigpu tests pass on pytorch 1.5.0

* Update src/transformers/modeling_utils.py

* Update src/transformers/modeling_utils.py

* rename multigpu to require_multigpu

* mode doc
2020-05-18 20:34:50 -04:00
Sam Shleifer
a699525d25 [test_pipelines] Mark tests > 10s @slow, small speedups (#4421) 2020-05-18 12:23:21 -04:00
Patrick von Platen
026a5d0888 [T5 fp16] Fix fp16 in T5 (#4436)
* fix fp16 in t5

* make style

* refactor invert_attention_mask fn

* fix typo
2020-05-18 17:25:58 +02:00
Funtowicz Morgan
31c799a0c9 Tag onnx export tests as slow (#4432) 2020-05-18 09:24:41 -04:00
Lorenzo Ampil
18d233d525 Allow the creation of "entity groups" for NerPipeline #3548 (#3957)
* Add index to be returned by NerPipeline to allow for the creation of

* Add entity groups

* Convert entity list to dict

* Add entity to entity_group_disagg atfter updating entity gorups

* Change 'group' parameter to 'grouped_entities'

* Add unit tests for grouped NER pipeline case

* Correct variable name typo for NER_FINETUNED_MODELS

* Sync grouped tests to recent test updates
2020-05-17 09:25:17 +02:00
Funtowicz Morgan
db0076a9df Conversion script to export transformers models to ONNX IR. (#4253)
* Added generic ONNX conversion script for PyTorch model.

* WIP initial TF support.

* TensorFlow/Keras ONNX export working.

* Print framework version info

* Add possibility to check the model is correctly loading on ONNX runtime.

* Remove quantization option.

* Specify ONNX opset version when exporting.

* Formatting.

* Remove unused imports.

* Make functions more generally reusable from other part of the code.

* isort happy.

* flake happy

* Export only feature-extraction for now

* Correctly check inputs order / filter before export.

* Removed task variable

* Fix invalid args call in load_graph_from_args.

* Fix invalid args call in convert.

* Fix invalid args call in infer_shapes.

* Raise exception and catch in caller function instead of exit.

* Add 04-onnx-export.ipynb notebook

* More WIP on the notebook

* Remove unused imports

* Simplify & remove unused constants.

* Export with constant_folding in PyTorch

* Let's try to put function args in the right order this time ...

* Disable external_data_format temporary

* ONNX notebook draft ready.

* Updated notebooks charts + wording

* Correct error while exporting last chart in notebook.

* Adressing @LysandreJik comment.

* Set ONNX opset to 11 as default value.

* Set opset param mandatory

* Added ONNX export unittests

* Quality.

* flake8 happy

* Add keras2onnx dependency on extras["tf"]

* Pin keras2onnx on github master to v1.6.5

* Second attempt.

* Third attempt.

* Use the right repo URL this time ...

* Do the same for onnxconverter-common

* Added keras2onnx and onnxconveter-common to 1.7.0 to supports TF2.2

* Correct commit hash.

* Addressing PR review: Optimization are enabled by default.

* Addressing PR review: small changes in the notebook

* setup.py comment about keras2onnx versioning.
2020-05-14 16:35:52 -04:00
Sam Shleifer
7822cd38a0 [tests] make pipelines tests faster with smaller models (#4238)
covers torch and tf. Also fixes a failing @slow test
2020-05-14 13:36:02 -04:00