Antonio V Mendoza
ea2c6f1afc
Adding the LXMERT pretraining model (MultiModal languageXvision) to HuggingFace's suite of models ( #5793 )
...
* added template files for LXMERT and completed the configuration_lxmert.py
* added modeling, tokenization, testing, and finishing touches for lxmert [yet to be tested]
* added model card for lxmert
* cleaning up lxmert code
* Update src/transformers/modeling_lxmert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
* Update src/transformers/modeling_tf_lxmert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
* Update src/transformers/modeling_tf_lxmert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
* Update src/transformers/modeling_lxmert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
* tested torch lxmert, changed documentation, updated outputs, and other small fixes
* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
* renaming, other small issues, did not change TF code in this commit
* added lxmert question answering model in pytorch
* added capability to edit number of qa labels for lxmert
* made answer optional for lxmert question answering
* add option to return hidden_states for lxmert
* changed default qa labels for lxmert
* changed config archive path
* squashing 3 commits: merged UI + testing improvements + more UI and testing
* changed some variable names for lxmert
* TF LXMERT
* Various fixes to LXMERT
* Final touches to LXMERT
* AutoTokenizer order
* Add LXMERT to index.rst and README.md
* Merge commit test fixes + Style update
* TensorFlow 2.3.0 sequential model changes variable names
Remove inherited test
* Update src/transformers/modeling_tf_pytorch_utils.py
* Update docs/source/model_doc/lxmert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* Update docs/source/model_doc/lxmert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* Update src/transformers/modeling_tf_lxmert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* added suggestions
* Fixes
* Final fixes for TF model
* Fix docs
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr >
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2020-09-03 04:02:25 -04:00
Puneetha Pai
4ebb52afdb
test_tf_common: remove unused mixin class parameters ( #6866 )
2020-09-02 10:54:40 -04:00
Stas Bekman
e71f32c0ef
[testing] fix ambiguous test ( #6898 )
...
Since `generate()` does:
```
num_beams = num_beams if num_beams is not None else self.config.num_beams
```
This test fails if `model.config.num_beams > 1` (which is the case in the model I'm porting).
This fix makes the test setup unambiguous by passing an explicit `num_beams=1` to `generate()`.
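The fallback described above can be sketched in isolation. This is a minimal illustration, not the library's code: `ToyConfig` and `ToyModel` are hypothetical stand-ins, and `generate()` here only returns the resolved beam count.

```python
class ToyConfig:
    """Hypothetical stand-in for a model config with a default beam count."""

    def __init__(self, num_beams=1):
        self.num_beams = num_beams


class ToyModel:
    """Hypothetical stand-in for a model; only the argument fallback is shown."""

    def __init__(self, config):
        self.config = config

    def generate(self, num_beams=None):
        # Same fallback as quoted above: an omitted argument defers to the config.
        num_beams = num_beams if num_beams is not None else self.config.num_beams
        return num_beams


model = ToyModel(ToyConfig(num_beams=4))
implicit = model.generate()             # resolves to the config value
explicit = model.generate(num_beams=1)  # explicit argument overrides the config
```

With a config whose `num_beams` is greater than 1, the implicit call silently switches to beam search, which is why the test passes the value explicitly.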
Thanks.
2020-09-02 16:18:17 +02:00
Suraj Patil
4230d30f77
[pipelines] Text2TextGenerationPipeline ( #6744 )
...
* add Text2TextGenerationPipeline
* remove max length warning
* remove comments
* remove input_length
* fix typo
* add tests
* use TFAutoModelForSeq2SeqLM
* doc
* typo
* add the doc below TextGenerationPipeline
* doc nit
* style
* delete comment
2020-09-02 07:34:35 -04:00
Patrick von Platen
afc4ece462
[Generate] Facilitate PyTorch generate using ModelOutputs ( #6735 )
...
* fix generate for GPT2 Double Head
* fix gpt2 double head model
* fix bart / t5
* also add for no beam search
* fix no beam search
* fix encoder decoder
* simplify t5
* simplify t5
* fix t5 tests
* fix BART
* fix transfo-xl
* fix conflict
* integrating Sylvain's and Sam's comments
* fix tf past_decoder_key_values
* fix enc dec test
2020-09-01 12:38:25 +02:00
Sam Shleifer
8af1970e45
Fix marian slow test ( #6854 )
2020-08-31 16:10:43 -04:00
Huang Lianzhe
2de7ee0385
Dataset and DataCollator for BERT Next Sentence Prediction (NSP) task ( #6644 )
...
* add datacollator and dataset for next sentence prediction task
* bug fix (numbers of special tokens & truncate sequences)
* bug fix (+ dict inputs support for data collator)
* add padding for nsp data collator; renamed cached files to avoid conflict.
* add test for nsp data collator
* Style
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr >
2020-08-31 08:25:00 -04:00
Stas Bekman
563485bf95
[tests] fix typos in inputs ( #6818 )
2020-08-30 18:19:57 +08:00
Sam Shleifer
0f58903bb6
Pegasus finetune script: add --adafactor ( #6811 )
2020-08-29 17:43:32 -04:00
Sam Shleifer
3cac867fac
t5 model should make decoder_attention_mask ( #6800 )
2020-08-28 15:22:33 -04:00
Sam Shleifer
20f7786453
Fix style ( #6803 )
2020-08-28 15:02:25 -04:00
Sam Shleifer
9336086ab5
prepare_seq2seq_batch makes labels; decoder_input_ids are made later ( #6654 )
...
* broken test
* batch parity
* tests pass
* boom boom
* boom boom
* split out bart tokenizer tests
* fix tests
* boom boom
* Fixed dataset bug
* Fix marian
* Undo extra
* Get marian working
* Fix t5 tok tests
* Test passing
* Cleanup
* better assert msg
* require torch
* Fix mbart tests
* undo extra decoder_attn_mask change
* Fix import
* pegasus tokenizer can ignore src_lang kwargs
* unused kwarg test cov
* boom boom
* add todo for pegasus issue
* cover one word translation edge case
* Cleanup
* doc
2020-08-28 11:15:17 -04:00
RafaelWO
cb276b41de
Transformer-XL: Improved tokenization with sacremoses ( #6322 )
...
* Improved tokenization with sacremoses
* The TransfoXLTokenizer is now using sacremoses for tokenization
* Added tokenization of comma-separated and floating point numbers.
* Removed prepare_for_tokenization() from tokenization_transfo_xl.py because punctuation is handled by sacremoses
* Added corresponding tests
* Removed test comparing TransfoXLTokenizer and TransfoXLTokenizerFast
* Added deprecation warning to TransfoXLTokenizerFast
* isort change
Co-authored-by: Teven <teven.lescao@gmail.com >
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
2020-08-28 09:56:17 -04:00
Stas Bekman
92ac2fa7d1
[transformers-cli] fix logger getter ( #6777 )
2020-08-27 20:01:17 -04:00
Lysandre
42fddacd1c
Format
2020-08-27 18:31:51 +02:00
Stas Bekman
dbfe34f2f5
[test schedulers] adjust to test the first step's reading ( #6429 )
...
* [test schedulers] small improvement
* cleanup
2020-08-27 12:23:28 -04:00
Stas Bekman
e6b811f0a7
[testing] replace hardcoded paths to allow running tests from anywhere ( #6523 )
...
* [testing] replace hardcoded paths to allow running tests from anywhere
* fix the merge conflict
2020-08-27 12:22:18 -04:00
Nikolai Yakovenko
971d1802d0
Add AdaFactor optimizer from fairseq ( #6722 )
...
* AdaFactor optimizer ported from fairseq. Tested for T5 finetuning and MLM -- reduced memory consumption compared to ADAM.
* update PR fixes, add basic test
* bug -- incorrect params in test
* bugfix -- import Adafactor into test
* bugfix -- removed accidental T5 include
* resetting T5 to master
* bugfix -- include Adafactor in __init__
* longer loop for adafactor test
* remove double error class declare
* lint
* black
* isort
* Update src/transformers/optimization.py
Co-authored-by: Sam Shleifer <sshleifer@gmail.com >
* single docstring
* Cleanup docstring
Co-authored-by: Nikolai Y <nikolai.yakovenko@point72.com >
Co-authored-by: Sam Shleifer <sshleifer@gmail.com >
2020-08-27 04:58:13 -04:00
Julien Chaumond
3242e4d942
[model_cards] Fix tiny typos
2020-08-26 23:16:06 +02:00
Patrick von Platen
858b7d5873
[TF Longformer] Improve Speed for TF Longformer ( #6447 )
...
* add tf graph compile tests
* fix conflict
* remove more tf transpose statements
* fix conflicts
* fix comment typos
* move function to class function
* fix black
* fix black
* make style
2020-08-26 14:55:41 -04:00
Lysandre
a75c64d80c
Black 20 release
2020-08-26 17:20:22 +02:00
Lysandre Debut
77abd1e79f
Centralize logging ( #6434 )
...
* Logging
* Style
* hf_logging > utils.logging
* Address @thomwolf's comments
* Update test
* Update src/transformers/benchmark/benchmark_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* Revert bad change
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2020-08-26 11:10:36 -04:00
Sam Shleifer
624495706c
T5Tokenizer adds EOS token if not already added ( #5866 )
...
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2020-08-25 14:56:08 -04:00
Sam Shleifer
e11d923bfc
Fix pegasus-xsum integration test ( #6726 )
2020-08-25 14:06:28 -04:00
Sylvain Gugger
abc0202194
More tests to Trainer ( #6699 )
...
* More tests to Trainer
* Add warning in the doc
2020-08-25 07:07:36 -04:00
Sylvain Gugger
a573777901
Update repo to isort v5 ( #6686 )
...
* Run new isort
* More changes
* Update CI, CONTRIBUTING and benchmarks
2020-08-24 11:03:01 -04:00
Sam Shleifer
5bf4465e6c
Regression test for pegasus bugfix ( #6606 )
2020-08-20 15:34:43 -04:00
sgugger
86c07e634f
One last threshold to raise
2020-08-20 14:23:09 -04:00
Sylvain Gugger
e8af90c052
Move threshold up for flaky test with Electra ( #6622 )
...
* Move threshold up for flaky test with Electra
* Update above as well
2020-08-20 13:59:40 -04:00
Patrick von Platen
505f2d749e
[Tests] fix attention masks in Tests ( #6621 )
...
* fix distilbert
* fix typo
2020-08-20 13:23:47 -04:00
Denisa Roberts
c9454507cf
Add tests for Reformer tokenizer ( #6485 )
2020-08-20 18:58:44 +02:00
Sylvain Gugger
573bdb0a5d
Add tests to Trainer ( #6605 )
...
* Add tests to Trainer
* Test if removing long breaks everything
* Remove ugly hack
* Fix distributed test
* Use float for number of epochs
2020-08-20 11:13:50 -04:00
Suraj Patil
7581884dee
[BartTokenizerFast] add prepare_seq2seq_batch ( #6543 )
2020-08-19 10:37:48 -04:00
Patrick von Platen
8bcceaceff
fix model outputs test ( #6593 )
2020-08-19 16:18:51 +02:00
Pradhy729
2a7402cbd3
Feed forward chunking others ( #6365 )
...
* Feed forward chunking for Distilbert & Albert
* Added ff chunking for many other models
* Change model signature
* Added chunking for XLM
* Cleaned up by removing some variables.
* remove test_chunking flag
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com >
2020-08-19 14:31:10 +02:00
Patrick von Platen
fe0b85e77a
[EncoderDecoder] Add functionality to tie encoder decoder weights ( #6538 )
...
* start adding tie encoder to decoder functionality
* finish model tying
* make style
* Apply suggestions from code review
* fix t5 list including cross attention
* apply Sam's suggestions
* Update src/transformers/modeling_encoder_decoder.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* add max depth break point
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2020-08-19 14:23:45 +02:00
Sam Shleifer
ab42d74850
Fix bart base test ( #6587 )
2020-08-18 21:28:10 -04:00
Sam Shleifer
1529bf9680
add BartConfig.force_bos_token_to_be_generated ( #6526 )
...
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2020-08-18 19:15:50 -04:00
Sam Shleifer
12d7624199
[marian] converter supports models from new Tatoeba project ( #6342 )
2020-08-17 23:55:42 -04:00
Suraj Patil
407da12ef1
[T5Tokenizer] add prepare_seq2seq_batch method ( #6122 )
...
* tests
2020-08-17 13:57:19 -04:00
Suraj Patil
2a77813d53
[BartTokenizer] add prepare s2s batch ( #6212 )
...
Co-authored-by: sgugger <sylvain.gugger@gmail.com >
2020-08-17 11:44:46 -04:00
Funtowicz Morgan
b41cc0b86a
Fix flaky ONNX tests ( #6531 )
2020-08-17 09:04:35 -04:00
Kevin Canwen Xu
37709b5909
Remove deprecated assertEquals ( #6532 )
...
`assertEquals` is deprecated: https://stackoverflow.com/questions/930995/assertequals-vs-assertequal-in-python/931011
This PR replaces these deprecated methods.
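A minimal sketch of the replacement, using only the standard library. The test case and the `run_example` helper are hypothetical illustrations and are not taken from the repository's test suite.

```python
import unittest


class AdditionTest(unittest.TestCase):
    def test_sum(self):
        # assertEqual is the supported spelling; assertEquals is the deprecated alias.
        self.assertEqual(1 + 1, 2)


def run_example():
    """Run AdditionTest and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AdditionTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The change is a pure rename at each call site; the arguments and the failure behaviour stay the same.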
2020-08-17 17:13:58 +08:00
Masatoshi Suzuki
48c6c6139f
Support additional dictionaries for BERT Japanese tokenizers ( #6515 )
...
* Update BERT Japanese tokenizers
* Update CircleCI config to download unidic
* Specify to use the latest dictionary packages
2020-08-17 12:00:23 +08:00
Patrick von Platen
1d6e71e116
[EncoderDecoder] Add Cross Attention for GPT2 ( #6415 )
...
* add cross attention layers for gpt2
* make gpt2 cross attention work
* finish bert2gpt2
* add explicit comments
* remove attention mask since not yet supported
* revert attn mask in pipeline
* Update src/transformers/modeling_gpt2.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* Update src/transformers/modeling_encoder_decoder.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2020-08-14 09:43:29 +02:00
Suraj Patil
680f1337c3
MBartForConditionalGeneration ( #6441 )
...
* add MBartForConditionalGeneration
* style
* rebase and fixes
* add mbart test in TEST_FILES_WITH_NO_COMMON_TESTS
* fix docs
* don't ignore mbart
* doc
* fix mbart fairseq link
* put mbart before bart
* apply doc suggestions
2020-08-14 03:21:16 -04:00
Lysandre Debut
f7cbc13db7
Test model outputs equivalence ( #6445 )
...
* Test model outputs equivalence
* Fix failing tests
* From dict to kwargs
* DistilBERT
* Addressing @sgugger and @patrickvonplaten's comments
2020-08-13 11:59:35 -04:00
Stas Bekman
e983da0e7d
cleanup tf unittests: part 2 ( #6260 )
...
* cleanup torch unittests: part 2
* remove trailing comma added by isort, and which breaks flake
* one more comma
* revert odd balls
* part 3: odd cases
* more ["key"] -> .key refactoring
* .numpy() is not needed
* more unnecessary .numpy() removed
* more simplification
2020-08-13 04:29:06 -04:00
Joe Davison
bc820476a5
add targets arg to fill-mask pipeline ( #6239 )
...
* add targets arg to fill-mask pipeline
* add tests and more error handling
* quality
* update docstring
2020-08-12 12:48:29 -04:00
Patrick von Platen
0735def8e1
[EncoderDecoder] Add encoder-decoder for roberta/ vanilla longformer ( #6411 )
...
* add encoder-decoder for roberta
* fix headmask
* apply Sylvain's suggestions
* fix typo
* Apply suggestions from code review
2020-08-12 18:23:30 +02:00