lewtun
4bb1d0ec84
Skip RoFormer ONNX test if rjieba not installed ( #16981 )
...
* Skip RoFormer ONNX test if rjieba not installed
* Update deps table
* Skip RoFormer serialization test
* Fix RoFormer vocab
* Add rjieba to CircleCI
2022-05-04 10:04:10 +02:00
Sylvain Gugger
1c9fcd0e04
Fix RNG reload in resume training from epoch checkpoint ( #17055 )
...
* Fix RNG reload in resume training from epoch checkpoint
* Fix test
2022-05-03 10:31:24 -04:00
Sylvain Gugger
a8fa2f91f4
Make Trainer compatible with sharded checkpoints ( #17053 )
...
* Make Trainer compatible with sharded checkpoints
* Add doc
2022-05-03 09:55:10 -04:00
Yih-Dar
19420fd99e
Move test model folders ( #17034 )
...
* move test model folders (TODO: fix imports and others)
* fix (potentially partially) imports (in model test modules)
* fix (potentially partially) imports (in tokenization test modules)
* fix (potentially partially) imports (in feature extraction test modules)
* fix import utils.test_modeling_tf_core
* fix path ../fixtures/
* fix imports about generation.test_generation_flax_utils
* fix more imports
* fix fixture path
* fix get_test_dir
* update module_to_test_file
* fix get_tests_dir from wrong transformers.utils
* update config.yml (CircleCI)
* fix style
* remove missing imports
* update new model script
* update check_repo
* update SPECIAL_MODULE_TO_TEST_MAP
* fix style
* add __init__
* update self-scheduled
* fix add_new_model scripts
* check one way to get location back
* python setup.py build install
* fix import in test auto
* update self-scheduled.yml
* update slack notification script
* Add comments about artifact names
* fix for yolos
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2022-05-03 14:42:02 +02:00
Sanchit Gandhi
cd9274d010
[FlaxBert] Add ForCausalLM ( #16995 )
...
* [FlaxBert] Add ForCausalLM
* make style
* fix output attentions
* Add RobertaForCausalLM
* remove comment
* fix fx-to-pt model loading
* remove comment
* add modeling tests
* add enc-dec model tests
* add big_bird
* add electra
* make style
* make repo-consistency
* add to docs
* remove roberta test
* quality
* amend cookiecutter
* fix attention_mask bug in flax bert model tester
* tighten pt-fx thresholds to 1e-5
* add 'copied from' statements
* amend 'copied from' statements
* amend 'copied from' statements
* quality
2022-05-03 11:26:19 +02:00
Patrick von Platen
31616b8d61
[T5 Tokenizer] Model has no fixed position ids - there is no hardcode… ( #16990 )
...
* [T5 Tokenizer] Model has no fixed position ids - there is no hardcoded max length
* correct t5 tokenizer
* fix test
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* finish
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2022-05-02 21:27:34 +02:00
NielsRogge
1ac698744c
Add YOLOS ( #16848 )
...
* First draft
* Add YolosForObjectDetection
* Make forward pass work
* Add mid position embeddings
* Add interpolation of position encodings
* Add expected values
* Add YOLOS to tests
* Add integration test
* Support tiny model as well
* Support all models in conversion script
* Remove mid_pe_size attribute
* Make more tests pass
* Add model to README and fix config
* Add copied from statements
* Rename base_model_prefix to vit
* Add missing YOLOS_PRETRAINED_CONFIG_ARCHIVE_MAP
* Apply suggestions from code review
* Apply more suggestions from code review
* Convert remaining checkpoints
* Improve docstrings
* Add YolosFeatureExtractor
* Add feature extractor to docs
* Add corresponding tests
* Fix style
* Fix docs
* Apply suggestion from code review
* Fix bad rebase
* Fix some more bad rebase
* Fix missing character
* Improve docs and variable names
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local >
2022-05-02 18:30:55 +02:00
NielsRogge
2de2c9ecca
Clean up vision tests ( #17024 )
...
* Clean up tests
* Make fixup
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local >
2022-05-02 16:28:58 +02:00
Joao Gante
fb0ae12947
TF: XLA bad words logits processor and list of processors ( #16974 )
2022-04-29 15:54:58 +01:00
Yih-Dar
e952e049b4
use scale=1.0 in floats_tensor called in speech model testers ( #17007 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2022-04-29 14:41:33 +02:00
Yih-Dar
5af5735f62
set eos_token_id to None to generate until max length ( #16989 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2022-04-28 19:47:38 +02:00
Yih-Dar
49d5bcb0f3
Fix HubertRobustTest PT/TF equivalence test on GPU ( #16943 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2022-04-27 10:50:03 +02:00
Krishna Sirumalla
aaee4038c3
Add onnx config for RoFormer ( #16861 )
...
* add roformer onnx config
2022-04-26 16:51:15 +02:00
code-review-doctor
6568752039
Fix issue probably-meant-fstring found at https://codereview.doctor ( #16913 )
2022-04-25 15:15:00 -04:00
Joao Gante
e03966e404
TF: XLA stable softmax ( #16892 )
...
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2022-04-25 20:10:51 +01:00
Rushi Chaudhari
8246caf3eb
added deit onnx config ( #16887 )
...
* added deit onnx config
2022-04-25 20:50:45 +02:00
Joao Gante
9331b37967
TF: XLA Logits Warpers ( #16899 )
...
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com >
2022-04-25 19:48:08 +01:00
Joao Gante
809dac48f9
TF: XLA logits processors - minimum length, forced eos, and forced bos ( #16912 )
...
* XLA min len, forced eos, and forced bos
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com >
2022-04-25 19:27:53 +01:00
Yih-Dar
32adbb26d6
Fix PyTorch RAG tests GPU OOM ( #16881 )
...
* add torch.cuda.empty_cache in some PT RAG tests
* torch.cuda.empty_cache in tearDownModule()
* tearDown()
* add gc.collect()
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2022-04-25 17:33:56 +02:00
Thomas Chaigneau
508baf1943
add bigbird typo fixes ( #16897 )
...
Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com >
2022-04-25 11:32:06 +02:00
Joao Gante
99c8226b12
TF: XLA repetition penalty ( #16879 )
2022-04-22 18:29:32 +01:00
Thomas Chaigneau
ec81c11a18
Add OnnxConfig for ConvBERT ( #16859 )
...
* add OnnxConfig for ConvBert
Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com >
2022-04-22 18:19:15 +02:00
Joao Gante
6d90d76f5d
TF: rework XLA generate tests ( #16866 )
2022-04-22 12:38:08 +01:00
Sylvain Gugger
cb555af2c7
Return input_ids in ImageGPT feature extractor ( #16872 )
2022-04-21 09:09:00 -04:00
Nicolas Patry
6620f60c0a
Long QuestionAnsweringPipeline fix. ( #16778 )
...
* Temporary commit with the long QA fix.
* Adding slow tests covering this fix.
* Removing fast test as it doesn't fail anyway.
2022-04-21 09:59:25 +02:00
Nicolas Patry
e13a91fe60
Fixing return type tensor with num_return_sequences>1. ( #16828 )
...
* Fixing return type tensor with `num_return_sequences>1`.
* Nit.
2022-04-20 16:11:51 +02:00
Yang Ming
ff06b17791
add DebertaV2 fast tokenizer ( #15529 )
...
Co-authored-by: alcinos <carion.nicolas@gmail.com >
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com >
Co-authored-by: Nicolas Carion <carion.nicolas@gmail.com >
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2022-04-20 10:26:51 +02:00
Manuel R. Ciosici
3104036e7f
Add support for bitsandbytes ( #15622 )
...
* Add initial BNB integration
* fixup! Add initial BNB integration
* Add bnb test decorator
* Update Adamw8bit option name
* Use the full bnb package name
* Override bnb for all embedding layers
* Fix package name
* Formatting
* Remove unnecessary import
* Update src/transformers/trainer.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com >
* Rename AdamwBNB optimizer option
* Add training test checking that bnb memory utilization is lower
* fix merge
* fix merge; fix + extend new test
* cleanup
* expand bnb
* move all require_* candidates to testing_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com >
Co-authored-by: Stas Bekman <stas@stason.org >
2022-04-19 16:01:29 -04:00
Yih-Dar
e6d23a4b9b
Improve test_pt_tf_model_equivalence on PT side ( #16731 )
...
* Update test_pt_tf_model_equivalence on PT side
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2022-04-19 21:13:27 +02:00
Joao Gante
f09c45e067
TF: Add sigmoid activation function ( #16819 )
2022-04-19 16:13:08 +01:00
Ella Charlaix
77de8d6c31
Add onnx export of models with a multiple choice classification head ( #16758 )
...
* Add export of models with a multiple-choice classification head
2022-04-19 15:51:51 +02:00
code-review-doctor
a2392415e9
Some tests misusing assertTrue for comparisons fix ( #16771 )
...
* Fix issue avoid-misusing-assert-true found at https://codereview.doctor
* fix tests
* fix tf
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com >
2022-04-19 14:44:08 +02:00
Suraj Patil
d3bd9ac728
[Flax] improve large model init and loading ( #16148 )
...
* begin do_init
* add params_shape_tree
* raise error if params are accessed when do_init is False
* don't allow do_init=False when keys are missing
* make shape tree a property
* assign self._params at the end
* add test for do_init
* add do_init arg to all flax models
* fix param setting
* disable do_init for composite models
* update test
* add do_init in FlaxBigBirdForMultipleChoice
* better names and errors
* improve test
* style
* add a warning when do_init=False
* remove extra if
* set params after _required_params
* add test for from_pretrained
* do_init => _do_init
* change warning to info
* fix typo
* add params in init_weights
* add params to gpt neo init
* add params to init_weights
* update do_init test
* Trigger CI
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com >
* update template
* trigger CI
* style
* style
* fix template
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com >
2022-04-19 14:19:55 +02:00
NielsRogge
494c2a8c4d
Clean up semantic segmentation tests ( #16801 )
...
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local >
2022-04-19 09:02:19 +02:00
jsnfly
51e0ebedcb
Allow passing encoder_outputs as tuple to EncoderDecoder Models ( #16814 )
...
* Add passing encoder_outputs as tuple to existing test
* Add check for tuple
* Add check for tuple also for speech and vision
Co-authored-by: jsnfly <jsnfly@gmx.de >
2022-04-18 19:49:58 +02:00
Patrick von Platen
8d3f952adb
[Data2Vec] Add data2vec vision ( #16760 )
...
* save intermediate
* add vision
* add vision
* save
* finish models
* finish models
* continue
* finish
* up
* up
* up
* tests all pass
* clean up
* up
* up
* fix bugs in beit
* correct docs
* finish
* finish docs
* make style
* up
* more fixes
* fix type hint
* make style
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com >
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* Update tests/data2vec/test_modeling_data2vec_vision.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com >
* fix test
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com >
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2022-04-18 17:52:13 +02:00
NielsRogge
d3c9d0e55f
[ViT, BEiT, DeiT, DPT] Improve code ( #16799 )
...
* Improve code
* Fix bugs
* Fix another bug
* Clean up DPT as well
* Update DPT model outputs
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local >
2022-04-18 09:25:08 -04:00
Joao Gante
b4ddd2677c
TF generate refactor - XLA sample ( #16713 )
2022-04-18 10:58:24 +01:00
Stas Bekman
5da33f8729
[modeling utils] revamp from_pretrained(..., low_cpu_mem_usage=True) + tests ( #16657 )
...
* add low_cpu_mem_usage tests
* wip: revamping
* wip
* install /usr/bin/time
* wip
* cleanup
* cleanup
* cleanup
* cleanup
* cleanup
* fix assert
* put the wrapper back
* cleanup; switch to bert-base-cased
* Trigger CI
* Trigger CI
2022-04-14 18:10:05 -07:00
Stas Bekman
ce2fef2ad2
[trainer / deepspeed] fix hyperparameter_search ( #16740 )
...
* [trainer / deepspeed] fix hyperparameter_search
* require optuna
* style
* oops
* add dep in the right place
* create deepspeed-testing dep group
* Trigger CI
2022-04-14 17:24:38 -07:00
code-review-doctor
1b7de41a07
Fix issue avoid-missing-comma found at https://codereview.doctor ( #16768 )
2022-04-14 16:42:27 -04:00
Nicolas Patry
195fbbb6cf
Enabling Tapex in table question answering pipeline. ( #16663 )
...
* Enabling `Tapex` in table question answering pipeline.
* Questions are independent for Tapex, making the test respect that.
* Missing extra space.
2022-04-14 09:06:14 +02:00
Yih-Dar
6bed0647fe
Reduce Funnel PT/TF diff ( #16744 )
...
* Make Funnel Test less flaky
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2022-04-13 17:19:52 +02:00
davidleonfdez
9f8bfe703c
Fix #16660 (tokenizers setters of ids of special tokens) ( #16661 )
...
* Fix setters of *_token_id properties of SpecialTokensMixin
* Test setters of common tokens ids
* Move to a separate test checks of setters of tokens ids
* Add independent test for ByT5
* Add Canine test
* Test speech to text
2022-04-13 07:49:06 -04:00
Santiago Castro
f7196f2e63
Fix decoding score comparison when using logits processors or warpers ( #10638 )
...
* Normalize using a logits warper
* Add a flag in `generate` to support the logit renormalization
* Add in RAG
2022-04-13 09:37:33 +01:00
Minh Chien Vu
9c9db751e2
add Bigbird ONNX config ( #16427 )
...
* add Bigbird ONNX config
2022-04-12 20:46:06 +02:00
Sanchit Gandhi
a960406722
[FlaxWav2Vec2Model] Fix bug in attention mask ( #16725 )
...
* [FlaxWav2Vec2Model] Fix bug in attention mask
* more fixes
* add (Flax)SpeechEncoderDecoderModel PT-FX cross-test
2022-04-12 19:48:24 +02:00
Joao Gante
d7f7f29f29
TF: remove set_tensor_by_indices_to_value ( #16729 )
2022-04-12 17:51:47 +01:00
Nicolas Patry
a192f61e08
Change the chunk_iter function to handle ( #16730 )
...
* Change the chunk_iter function to handle the subtle cases where the last chunk gets ignored since all the data is in the `left_strided` data. We need to remove the right striding on the previous item.
* Remove commented line.
2022-04-12 18:25:02 +02:00
Yih-Dar
dce33f2150
Improve PT/TF equivalence test ( #16557 )
...
* add error message
* Use names in the error message
* allow ModelOutput
* rename to check_pt_tf_outputs and move outside
* fix style
* skip past_key_values in a better way
* Add comments
* improve code for label/loss
* make the logic clear by moving the ignore keys out
* fix _postprocessing_to_ignore
* fix _postprocessing_to_ignore: create new outputs from the remaining fields
* ignore past_key_values in TFGPT2 models for now
* make check_pt_tf_outputs better regarding names
* move check_pt_tf_models outside
* rename methods
* remove test_pt_tf_model_equivalence in TFCLIPModelTest
* Reduce TFViTMAEModelTest.test_pt_tf_model_equivalence
* move prepare_pt_inputs_from_tf_inputs outside check_pt_tf_models
* Fix quality
* Clean-up TFLxmertModelTester.test_pt_tf_model_equivalence
* Fix quality
* fix
* fix style
* Clean-up TFLEDModelTest.test_pt_tf_model_equivalence
* Fix quality
* add docstring
* improve comment
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2022-04-11 22:19:12 +02:00