Commit Graph

1767 Commits

Author SHA1 Message Date
lewtun
4bb1d0ec84 Skip RoFormer ONNX test if rjieba not installed (#16981)
* Skip RoFormer ONNX test if rjieba not installed

* Update deps table

* Skip RoFormer serialization test

* Fix RoFormer vocab

* Add rjieba to CircleCI
2022-05-04 10:04:10 +02:00
Sylvain Gugger
1c9fcd0e04 Fix RNG reload in resume training from epoch checkpoint (#17055)
* Fix RNG reload in resume training from epoch checkpoint

* Fix test
2022-05-03 10:31:24 -04:00
Sylvain Gugger
a8fa2f91f4 Make Trainer compatible with sharded checkpoints (#17053)
* Make Trainer compatible with sharded checkpoints

* Add doc
2022-05-03 09:55:10 -04:00
Yih-Dar
19420fd99e Move test model folders (#17034)
* move test model folders (TODO: fix imports and others)

* fix (potentially partially) imports (in model test modules)

* fix (potentially partially) imports (in tokenization test modules)

* fix (potentially partially) imports (in feature extraction test modules)

* fix import utils.test_modeling_tf_core

* fix path ../fixtures/

* fix imports about generation.test_generation_flax_utils

* fix more imports

* fix fixture path

* fix get_test_dir

* update module_to_test_file

* fix get_tests_dir from wrong transformers.utils

* update config.yml (CircleCI)

* fix style

* remove missing imports

* update new model script

* update check_repo

* update SPECIAL_MODULE_TO_TEST_MAP

* fix style

* add __init__

* update self-scheduled

* fix add_new_model scripts

* check one way to get location back

* python setup.py build install

* fix import in test auto

* update self-scheduled.yml

* update slack notification script

* Add comments about artifact names

* fix for yolos

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-03 14:42:02 +02:00
Sanchit Gandhi
cd9274d010 [FlaxBert] Add ForCausalLM (#16995)
* [FlaxBert] Add ForCausalLM

* make style

* fix output attentions

* Add RobertaForCausalLM

* remove comment

* fix fx-to-pt model loading

* remove comment

* add modeling tests

* add enc-dec model tests

* add big_bird

* add electra

* make style

* make repo-consistency

* add to docs

* remove roberta test

* quality

* amend cookiecutter

* fix attention_mask bug in flax bert model tester

* tighten pt-fx thresholds to 1e-5

* add 'copied from' statements

* amend 'copied from' statements

* amend 'copied from' statements

* quality
2022-05-03 11:26:19 +02:00
Patrick von Platen
31616b8d61 [T5 Tokenizer] Model has no fixed position ids - there is no hardcode… (#16990)
* [T5 Tokenizer] Model has no fixed position ids - there is no hardcoded max length

* [T5 Tokenizer] Model has no fixed position ids - there is no hardcoded max length

* correct t5 tokenizer

* correct t5 tokenizer

* fix test

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* finish

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-05-02 21:27:34 +02:00
NielsRogge
1ac698744c Add YOLOS (#16848)
* First draft

* Add YolosForObjectDetection

* Make forward pass work

* Add mid position embeddings

* Add interpolation of position encodings

* Add expected values

* Add YOLOS to tests

* Add integration test

* Support tiny model as well

* Support all models in conversion script

* Remove mid_pe_size attribute

* Make more tests pass

* Add model to README and fix config

* Add copied from statements

* Rename base_model_prefix to vit

* Add missing YOLOS_PRETRAINED_CONFIG_ARCHIVE_MAP

* Apply suggestions from code review

* Apply more suggestions from code review

* Convert remaining checkpoints

* Improve docstrings

* Add YolosFeatureExtractor

* Add feature extractor to docs

* Add corresponding tests

* Fix style

* Fix docs

* Apply suggestion from code review

* Fix bad rebase

* Fix some more bad rebase

* Fix missing character

* Improve docs and variable names

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-05-02 18:30:55 +02:00
NielsRogge
2de2c9ecca Clean up vision tests (#17024)
* Clean up tests

* Make fixup

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-05-02 16:28:58 +02:00
Joao Gante
fb0ae12947 TF: XLA bad words logits processor and list of processors (#16974)
2022-04-29 15:54:58 +01:00
Yih-Dar
e952e049b4 use scale=1.0 in floats_tensor called in speech model testers (#17007)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-29 14:41:33 +02:00
Yih-Dar
5af5735f62 set eos_token_id to None to generate until max length (#16989)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-28 19:47:38 +02:00
Yih-Dar
49d5bcb0f3 Fix HubertRobustTest PT/TF equivalence test on GPU (#16943)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-27 10:50:03 +02:00
Krishna Sirumalla
aaee4038c3 Add onnx config for RoFormer (#16861)
* add roformer onnx config
2022-04-26 16:51:15 +02:00
code-review-doctor
6568752039 Fix issue probably-meant-fstring found at https://codereview.doctor (#16913)
2022-04-25 15:15:00 -04:00
Joao Gante
e03966e404 TF: XLA stable softmax (#16892)
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-25 20:10:51 +01:00
Rushi Chaudhari
8246caf3eb added deit onnx config (#16887)
* added deit onnx config
2022-04-25 20:50:45 +02:00
Joao Gante
9331b37967 TF: XLA Logits Warpers (#16899)
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2022-04-25 19:48:08 +01:00
Joao Gante
809dac48f9 TF: XLA logits processors - minimum length, forced eos, and forced bos (#16912)
* XLA min len, forced eos, and forced bos

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2022-04-25 19:27:53 +01:00
Yih-Dar
32adbb26d6 Fix PyTorch RAG tests GPU OOM (#16881)
* add torch.cuda.empty_cache in some PT RAG tests

* torch.cuda.empty_cache in tearDownModule()

* tearDown()

* add gc.collect()

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-25 17:33:56 +02:00
Thomas Chaigneau
508baf1943 add bigbird typo fixes (#16897)
Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>
2022-04-25 11:32:06 +02:00
Joao Gante
99c8226b12 TF: XLA repetition penalty (#16879)
2022-04-22 18:29:32 +01:00
Thomas Chaigneau
ec81c11a18 Add OnnxConfig for ConvBERT (#16859)
* add OnnxConfig for ConvBert

Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>
2022-04-22 18:19:15 +02:00
Joao Gante
6d90d76f5d TF: rework XLA generate tests (#16866)
2022-04-22 12:38:08 +01:00
Sylvain Gugger
cb555af2c7 Return input_ids in ImageGPT feature extractor (#16872)
2022-04-21 09:09:00 -04:00
Nicolas Patry
6620f60c0a Long QuestionAnsweringPipeline fix. (#16778)
* Temporary commit with the long QA fix.

* Adding slow tests covering this fix.

* Removing fast test as it doesn't fail anyway.
2022-04-21 09:59:25 +02:00
Nicolas Patry
e13a91fe60 Fixing return type tensor with num_return_sequences>1. (#16828)
* Fixing return type tensor with `num_return_sequences>1`.

* Nit.
2022-04-20 16:11:51 +02:00
Yang Ming
ff06b17791 add DebertaV2 fast tokenizer (#15529)
Co-authored-by: alcinos <carion.nicolas@gmail.com>
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
Co-authored-by: Nicolas Carion <carion.nicolas@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-20 10:26:51 +02:00
Manuel R. Ciosici
3104036e7f Add support for bitsandbytes (#15622)
* Add initial BNB integration

* fixup! Add initial BNB integration

* Add bnb test decorator

* Update Adamw8bit option name

* Use the full bnb package name

* Override bnb for all embedding layers

* Fix package name

* Formatting

* Remove unnecessary import

* Update src/transformers/trainer.py

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Rename AdamwBNB optimizer option

* Add training test checking that bnb memory utilization is lower

* fix merge

* fix merge; fix + extend new test

* cleanup

* expand bnb

* move all require_* candidates to testing_utils.py

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
2022-04-19 16:01:29 -04:00
Yih-Dar
e6d23a4b9b Improve test_pt_tf_model_equivalence on PT side (#16731)
* Update test_pt_tf_model_equivalence on PT side

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-19 21:13:27 +02:00
Joao Gante
f09c45e067 TF: Add sigmoid activation function (#16819)
2022-04-19 16:13:08 +01:00
Ella Charlaix
77de8d6c31 Add onnx export of models with a multiple choice classification head (#16758)
* Add export of models with a multiple-choice classification head
2022-04-19 15:51:51 +02:00
code-review-doctor
a2392415e9 Some tests misusing assertTrue for comparisons fix (#16771)
* Fix issue avoid-misusing-assert-true found at https://codereview.doctor

* fix tests

* fix tf

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-04-19 14:44:08 +02:00
Suraj Patil
d3bd9ac728 [Flax] improve large model init and loading (#16148)
* begin do_init

* add params_shape_tree

* raise error if params are accessed when do_init is False

* don't allow do_init=False when keys are missing

* make shape tree a property

* assign self._params at the end

* add test for do_init

* add do_init arg to all flax models

* fix param setting

* disable do_init for composite models

* update test

* add do_init in FlaxBigBirdForMultipleChoice

* better names and errors

* improve test

* style

* add a warning when do_init=False

* remove extra if

* set params after _required_params

* add test for from_pretrained

* do_init => _do_init

* change warning to info

* fix typo

* add params in init_weights

* add params to gpt neo init

* add params to init_weights

* update do_init test

* Trigger CI

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update template

* trigger CI

* style

* style

* fix template

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-04-19 14:19:55 +02:00
NielsRogge
494c2a8c4d Clean up semantic segmentation tests (#16801)
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-04-19 09:02:19 +02:00
jsnfly
51e0ebedcb Allow passing encoder_outputs as tuple to EncoderDecoder Models (#16814)
* Add passing encoder_outputs as tuple to existing test

* Add check for tuple

* Add check for tuple also for speech and vision

Co-authored-by: jsnfly <jsnfly@gmx.de>
2022-04-18 19:49:58 +02:00
Patrick von Platen
8d3f952adb [Data2Vec] Add data2vec vision (#16760)
* save intermediate

* add vision

* add vision

* save

* finish models

* finish models

* continue

* finish

* up

* up

* up

* tests all pass

* clean up

* up

* up

* fix bugs in beit

* correct docs

* finish

* finish docs

* make style

* up

* more fixes

* fix type hint

* make style

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/data2vec/test_modeling_data2vec_vision.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fix test

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-18 17:52:13 +02:00
NielsRogge
d3c9d0e55f [ViT, BEiT, DeiT, DPT] Improve code (#16799)
* Improve code

* Fix bugs

* Fix another bug

* Clean up DPT as well

* Update DPT model outputs

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-04-18 09:25:08 -04:00
Joao Gante
b4ddd2677c TF generate refactor - XLA sample (#16713)
2022-04-18 10:58:24 +01:00
Stas Bekman
5da33f8729 [modeling utils] revamp from_pretrained(..., low_cpu_mem_usage=True) + tests (#16657)
* add low_cpu_mem_usage tests

* wip: revamping

* wip

* install /usr/bin/time

* wip

* cleanup

* cleanup

* cleanup

* cleanup

* cleanup

* fix assert

* put the wrapper back

* cleanup; switch to bert-base-cased

* Trigger CI

* Trigger CI
2022-04-14 18:10:05 -07:00
Stas Bekman
ce2fef2ad2 [trainer / deepspeed] fix hyperparameter_search (#16740)
* [trainer / deepspeed] fix hyperparameter_search

* require optuna

* style

* oops

* add dep in the right place

* create deepspeed-testing dep group

* Trigger CI
2022-04-14 17:24:38 -07:00
code-review-doctor
1b7de41a07 Fix issue avoid-missing-comma found at https://codereview.doctor (#16768)
2022-04-14 16:42:27 -04:00
Nicolas Patry
195fbbb6cf Enabling Tapex in table question answering pipeline. (#16663)
* Enabling `Tapex` in table question answering pipeline.

* Questions are independent for Tapex, making the test respect that.

* Missing extra space.
2022-04-14 09:06:14 +02:00
Yih-Dar
6bed0647fe Reduce Funnel PT/TF diff (#16744)
* Make Funnel Test less flaky

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-13 17:19:52 +02:00
davidleonfdez
9f8bfe703c Fix #16660 (tokenizers setters of ids of special tokens) (#16661)
* Fix setters of *_token_id properties of SpecialTokensMixin

* Test setters of common tokens ids

* Move to a separate test checks of setters of tokens ids

* Add independent test for ByT5

* Add Canine test

* Test speech to text
2022-04-13 07:49:06 -04:00
Santiago Castro
f7196f2e63 Fix decoding score comparison when using logits processors or warpers (#10638)
* Normalize using a logits warper

* Add a flag in `generate` to support the logit renormalization

* Add in RAG
2022-04-13 09:37:33 +01:00
Minh Chien Vu
9c9db751e2 add Bigbird ONNX config (#16427)
* add Bigbird ONNX config
2022-04-12 20:46:06 +02:00
Sanchit Gandhi
a960406722 [FlaxWav2Vec2Model] Fix bug in attention mask (#16725)
* [FlaxWav2Vec2Model] Fix bug in attention mask

* more fixes

* add (Flax)SpeechEncoderDecoderModel PT-FX cross-test
2022-04-12 19:48:24 +02:00
Joao Gante
d7f7f29f29 TF: remove set_tensor_by_indices_to_value (#16729)
2022-04-12 17:51:47 +01:00
Nicolas Patry
a192f61e08 Change the chunk_iter function to handle (#16730)
* Change the chunk_iter function to handle

the subtle cases where the last chunk gets ignored since all the
data is in the `left_strided` data.

We need to remove the right striding on the previous item.

* Remove commented line.
2022-04-12 18:25:02 +02:00
Yih-Dar
dce33f2150 Improve PT/TF equivalence test (#16557)
* add error message

* Use names in the error message

* allow ModelOutput

* rename to check_pt_tf_outputs and move outside

* fix style

* skip past_key_values in a better way

* Add comments

* improve code for label/loss

* make the logic clear by moving the ignore keys out

* fix _postprocessing_to_ignore

* fix _postprocessing_to_ignore: create new outputs from the remaining fields

* ignore past_key_values in TFGPT2 models for now

* make check_pt_tf_outputs better regarding names

* move check_pt_tf_models outside

* rename methods

* remove test_pt_tf_model_equivalence in TFCLIPModelTest

* Reduce TFViTMAEModelTest.test_pt_tf_model_equivalence

* move prepare_pt_inputs_from_tf_inputs outside check_pt_tf_models

* Fix quality

* Clean-up TFLxmertModelTester.test_pt_tf_model_equivalence

* Fix quality

* fix

* fix style

* Clean-up TFLEDModelTest.test_pt_tf_model_equivalence

* Fix quality

* add docstring

* improve comment

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-11 22:19:12 +02:00