Commit Graph

509 Commits

Author SHA1 Message Date
Antonio V Mendoza
ea2c6f1afc Adding the LXMERT pretraining model (MultiModal languageXvision) to HuggingFace's suite of models (#5793)
* added template files for LXMERT and competed the configuration_lxmert.py

* added modeling, tokization, testing, and finishing touched for lxmert [yet to be tested]

* added model card for lxmert

* cleaning up lxmert code

* Update src/transformers/modeling_lxmert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_tf_lxmert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_tf_lxmert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_lxmert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* tested torch lxmert, changed documtention, updated outputs, and other small fixes

* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* renaming, other small issues, did not change TF code in this commit

* added lxmert question answering model in pytorch

* added capability to edit number of qa labels for lxmert

* made answer optional for lxmert question answering

* add option to return hidden_states for lxmert

* changed default qa labels for lxmert

* changed config archive path

* squshing 3 commits: merged UI + testing improvments + more UI and testing

* changed some variable names for lxmert

* TF LXMERT

* Various fixes to LXMERT

* Final touches to LXMERT

* AutoTokenizer order

* Add LXMERT to index.rst and README.md

* Merge commit test fixes + Style update

* TensorFlow 2.3.0 sequential model changes variable names

Remove inherited test

* Update src/transformers/modeling_tf_pytorch_utils.py

* Update docs/source/model_doc/lxmert.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/model_doc/lxmert.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_lxmert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* added suggestions

* Fixes

* Final fixes for TF model

* Fix docs

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-09-03 04:02:25 -04:00
Puneetha Pai
4ebb52afdb test_tf_common: remove un_used mixin class parameters (#6866) 2020-09-02 10:54:40 -04:00
Stas Bekman
e71f32c0ef [testing] fix ambiguous test (#6898)
Since `generate()` does:
```
        num_beams = num_beams if num_beams is not None else self.config.num_beams
```
This test fails if `model.config.num_beams > 1` (which is the case in the model I'm porting).

This fix makes the test setup unambiguous by passing an explicit `num_beams=1` to `generate()`.

Thanks.
2020-09-02 16:18:17 +02:00
Suraj Patil
4230d30f77 [pipelines] Text2TextGenerationPipeline (#6744)
* add Text2TextGenerationPipeline

* remove max length warning

* remove comments

* remove input_length

* fix typo

* add tests

* use TFAutoModelForSeq2SeqLM

* doc

* typo

* add the doc below TextGenerationPipeline

* doc nit

* style

* delete comment
2020-09-02 07:34:35 -04:00
Patrick von Platen
afc4ece462 [Generate] Facilitate PyTorch generate using ModelOutputs (#6735)
* fix generate for GPT2 Double Head

* fix gpt2 double head model

* fix  bart / t5

* also add for no beam search

* fix no beam search

* fix encoder decoder

* simplify t5

* simplify t5

* fix t5 tests

* fix BART

* fix transfo-xl

* fix conflict

* integrating sylvains and sams comments

* fix tf past_decoder_key_values

* fix enc dec test
2020-09-01 12:38:25 +02:00
Sam Shleifer
8af1970e45 Fix marian slow test (#6854) 2020-08-31 16:10:43 -04:00
Huang Lianzhe
2de7ee0385 Dataset and DataCollator for BERT Next Sentence Prediction (NSP) task (#6644)
* add datacollator and dataset for next sentence prediction task

* bug fix (numbers of special tokens & truncate sequences)

* bug fix (+ dict inputs support for data collator)

* add padding for nsp data collator; renamed cached files to avoid conflict.

* add test for nsp data collator

* Style

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-08-31 08:25:00 -04:00
Stas Bekman
563485bf95 [tests] fix typos in inputs (#6818) 2020-08-30 18:19:57 +08:00
Sam Shleifer
0f58903bb6 Pegasus finetune script: add --adafactor (#6811) 2020-08-29 17:43:32 -04:00
Sam Shleifer
3cac867fac t5 model should make decoder_attention_mask (#6800) 2020-08-28 15:22:33 -04:00
Sam Shleifer
20f7786453 Fix style (#6803) 2020-08-28 15:02:25 -04:00
Sam Shleifer
9336086ab5 prepare_seq2seq_batch makes labels/ decoder_input_ids made later. (#6654)
* broken test

* batch parity

* tests pass

* boom boom

* boom boom

* split out bart tokenizer tests

* fix tests

* boom boom

* Fixed dataset bug

* Fix marian

* Undo extra

* Get marian working

* Fix t5 tok tests

* Test passing

* Cleanup

* better assert msg

* require torch

* Fix mbart tests

* undo extra decoder_attn_mask change

* Fix import

* pegasus tokenizer can ignore src_lang kwargs

* unused kwarg test cov

* boom boom

* add todo for pegasus issue

* cover one word translation edge case

* Cleanup

* doc
2020-08-28 11:15:17 -04:00
RafaelWO
cb276b41de Transformer-XL: Improved tokenization with sacremoses (#6322)
* Improved tokenization with sacremoses

 * The TransfoXLTokenizer is now using sacremoses for tokenization
 * Added tokenization of comma-separated and floating point numbers.
 * Removed prepare_for_tokenization() from tokenization_transfo_xl.py because punctuation is handled by sacremoses
 * Added corresponding tests
 * Removed test comapring TransfoXLTokenizer and TransfoXLTokenizerFast
 * Added deprecation warning to TransfoXLTokenizerFast

* isort change

Co-authored-by: Teven <teven.lescao@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-08-28 09:56:17 -04:00
Stas Bekman
92ac2fa7d1 [transformers-cli] fix logger getter (#6777) 2020-08-27 20:01:17 -04:00
Lysandre
42fddacd1c Format 2020-08-27 18:31:51 +02:00
Stas Bekman
dbfe34f2f5 [test schedulers] adjust to test the first step's reading (#6429)
* [test schedulers] small improvement

* cleanup
2020-08-27 12:23:28 -04:00
Stas Bekman
e6b811f0a7 [testing] replace hardcoded paths to allow running tests from anywhere (#6523)
* [testing] replace hardcoded paths to allow running tests from anywhere

* fix the merge conflict
2020-08-27 12:22:18 -04:00
Nikolai Yakovenko
971d1802d0 Add AdaFactor optimizer from fairseq (#6722)
* AdaFactor optimizer ported from fairseq. Tested for T5 finetuning and MLM -- reduced memory consumption compared to ADAM.

* update PR fixes, add basic test

* bug -- incorrect params in test

* bugfix -- import Adafactor into test

* bugfix -- removed accidental T5 include

* resetting T5 to master

* bugfix -- include Adafactor in __init__

* longer loop for adafactor test

* remove double error class declare

* lint

* black

* isort

* Update src/transformers/optimization.py

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

* single docstring

* Cleanup docstring

Co-authored-by: Nikolai Y <nikolai.yakovenko@point72.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-08-27 04:58:13 -04:00
Julien Chaumond
3242e4d942 [model_cards] Fix tiny typos 2020-08-26 23:16:06 +02:00
Patrick von Platen
858b7d5873 [TF Longformer] Improve Speed for TF Longformer (#6447)
* add tf graph compile tests

* fix conflict

* remove more tf transpose statements

* fix conflicts

* fix comment typos

* move function to class function

* fix black

* fix black

* make style
2020-08-26 14:55:41 -04:00
Lysandre
a75c64d80c Black 20 release 2020-08-26 17:20:22 +02:00
Lysandre Debut
77abd1e79f Centralize logging (#6434)
* Logging

* Style

* hf_logging > utils.logging

* Address @thomwolf's comments

* Update test

* Update src/transformers/benchmark/benchmark_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Revert bad change

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-08-26 11:10:36 -04:00
Sam Shleifer
624495706c T5Tokenizer adds EOS token if not already added (#5866)
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-08-25 14:56:08 -04:00
Sam Shleifer
e11d923bfc Fix pegasus-xsum integration test (#6726) 2020-08-25 14:06:28 -04:00
Sylvain Gugger
abc0202194 More tests to Trainer (#6699)
* More tests to Trainer

* Add warning in the doc
2020-08-25 07:07:36 -04:00
Sylvain Gugger
a573777901 Update repo to isort v5 (#6686)
* Run new isort

* More changes

* Update CI, CONTRIBUTING and benchmarks
2020-08-24 11:03:01 -04:00
Sam Shleifer
5bf4465e6c Regression test for pegasus bugfix (#6606) 2020-08-20 15:34:43 -04:00
sgugger
86c07e634f One last threshold to raise 2020-08-20 14:23:09 -04:00
Sylvain Gugger
e8af90c052 Move threshold up for flaky test with Electra (#6622)
* Move threshold up for flaky test with Electra

* Update above as well
2020-08-20 13:59:40 -04:00
Patrick von Platen
505f2d749e [Tests] fix attention masks in Tests (#6621)
* fix distilbert

* fix typo
2020-08-20 13:23:47 -04:00
Denisa Roberts
c9454507cf Add tests for Reformer tokenizer (#6485) 2020-08-20 18:58:44 +02:00
Sylvain Gugger
573bdb0a5d Add tests to Trainer (#6605)
* Add tests to Trainer

* Test if removing long breaks everything

* Remove ugly hack

* Fix distributed test

* Use float for number of epochs
2020-08-20 11:13:50 -04:00
Suraj Patil
7581884dee [BartTokenizerFast] add prepare_seq2seq_batch (#6543) 2020-08-19 10:37:48 -04:00
Patrick von Platen
8bcceaceff fix model outputs test (#6593) 2020-08-19 16:18:51 +02:00
Pradhy729
2a7402cbd3 Feed forward chunking others (#6365)
* Feed forward chunking for Distilbert & Albert

* Added ff chunking for many other models

* Change model signature

* Added chunking for XLM

* Cleaned up by removing some variables.

* remove test_chunking flag

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2020-08-19 14:31:10 +02:00
Patrick von Platen
fe0b85e77a [EncoderDecoder] Add functionality to tie encoder decoder weights (#6538)
* start adding tie encoder to decoder functionality

* finish model tying

* make style

* Apply suggestions from code review

* fix t5 list including cross attention

* apply sams suggestions

* Update src/transformers/modeling_encoder_decoder.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* add max depth break point

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-08-19 14:23:45 +02:00
Sam Shleifer
ab42d74850 Fix bart base test (#6587) 2020-08-18 21:28:10 -04:00
Sam Shleifer
1529bf9680 add BartConfig.force_bos_token_to_be_generated (#6526)
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-08-18 19:15:50 -04:00
Sam Shleifer
12d7624199 [marian] converter supports models from new Tatoeba project (#6342) 2020-08-17 23:55:42 -04:00
Suraj Patil
407da12ef1 [T5Tokenizer] add prepare_seq2seq_batch method (#6122)
* tests
2020-08-17 13:57:19 -04:00
Suraj Patil
2a77813d53 [BartTokenizer] add prepare s2s batch (#6212)
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
2020-08-17 11:44:46 -04:00
Funtowicz Morgan
b41cc0b86a Fix flaky ONNX tests (#6531) 2020-08-17 09:04:35 -04:00
Kevin Canwen Xu
37709b5909 Remove deprecated assertEquals (#6532)
`assertEquals` is deprecated: https://stackoverflow.com/questions/930995/assertequals-vs-assertequal-in-python/931011
This PR replaces these deprecated methods.
2020-08-17 17:13:58 +08:00
Masatoshi Suzuki
48c6c6139f Support additional dictionaries for BERT Japanese tokenizers (#6515)
* Update BERT Japanese tokenizers

* Update CircleCI config to download unidic

* Specify to use the latest dictionary packages
2020-08-17 12:00:23 +08:00
Patrick von Platen
1d6e71e116 [EncoderDecoder] Add Cross Attention for GPT2 (#6415)
* add cross attention layers for gpt2

* make gpt2 cross attention work

* finish bert2gpt2

* add explicit comments

* remove attention mask since not yet supported

* revert attn mask in pipeline

* Update src/transformers/modeling_gpt2.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_encoder_decoder.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-08-14 09:43:29 +02:00
Suraj Patil
680f1337c3 MBartForConditionalGeneration (#6441)
* add MBartForConditionalGeneration

* style

* rebase and fixes

* add mbart test in TEST_FILES_WITH_NO_COMMON_TESTS

* fix docs

* don't ignore mbart

* doc

* fix mbart fairseq link

* put mbart before bart

* apply doc suggestions
2020-08-14 03:21:16 -04:00
Lysandre Debut
f7cbc13db7 Test model outputs equivalence (#6445)
* Test model outputs equivalence

* Fix failing tests

* From dict to kwargs

* DistilBERT

* Addressing @sgugger and @patrickvonplaten's comments
2020-08-13 11:59:35 -04:00
Stas Bekman
e983da0e7d cleanup tf unittests: part 2 (#6260)
* cleanup torch unittests: part 2

* remove trailing comma added by isort, and which breaks flake

* one more comma

* revert odd balls

* part 3: odd cases

* more ["key"] -> .key refactoring

* .numpy() is not needed

* more unncessary .numpy() removed

* more simplification
2020-08-13 04:29:06 -04:00
Joe Davison
bc820476a5 add targets arg to fill-mask pipeline (#6239)
* add targets arg to fill-mask pipeline

* add tests and more error handling

* quality

* update docstring
2020-08-12 12:48:29 -04:00
Patrick von Platen
0735def8e1 [EncoderDecoder] Add encoder-decoder for roberta/ vanilla longformer (#6411)
* add encoder-decoder for roberta

* fix headmask

* apply Sylvains suggestions

* fix typo

* Apply suggestions from code review
2020-08-12 18:23:30 +02:00