Commit Graph

9573 Commits

Author SHA1 Message Date
Joao Gante
6984848ed0 Create empty venv on cache miss (#16816) 2022-04-18 07:49:31 -04:00
Allan Jie
438144832e Raise error and suggestion when using custom optimizer with Fairscale or Deepspeed (#16786)
* optimizer issues related to saving

* remove the "optimizer saving" option

* reformat using make style
2022-04-18 07:47:21 -04:00
Joao Gante
b4ddd2677c TF generate refactor - XLA sample (#16713) 2022-04-18 10:58:24 +01:00
Joao Gante
02de7a8e7f CI: non-remote GH Actions now use a python venv (#16789) 2022-04-18 09:47:38 +01:00
Sylvain Gugger
dee6f01636 Pin Jax to last working release (#16808)
* Pin Jax to last working release

* Try lower

* Try lower
2022-04-16 21:15:19 -04:00
NielsRogge
78f346c2b5 Update README.md (#16797) 2022-04-15 14:10:16 +02:00
Yih-Dar
ee209d4d01 Fix PT TF ViTMAE (#16766)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-15 06:37:10 +02:00
Stas Bekman
5da33f8729 [modeling utils] revamp from_pretrained(..., low_cpu_mem_usage=True) + tests (#16657)
* add low_cpu_mem_usage tests

* wip: revamping

* wip

* install /usr/bin/time

* wip

* cleanup

* cleanup

* cleanup

* cleanup

* cleanup

* fix assert

* put the wrapper back

* cleanup; switch to bert-base-cased

* Trigger CI

* Trigger CI
2022-04-14 18:10:05 -07:00
Stas Bekman
ce2fef2ad2 [trainer / deepspeed] fix hyperparameter_search (#16740)
* [trainer / deepspeed] fix hyperparameter_search

* require optuna

* style

* oops

* add dep in the right place

* create deepspeed-testing dep group

* Trigger CI
2022-04-14 17:24:38 -07:00
code-review-doctor
1b7de41a07 Fix issue avoid-missing-comma found at https://codereview.doctor (#16768) 2022-04-14 16:42:27 -04:00
Sanchit Gandhi
de8b06f9bf [SpeechEncoderDecoderModel] Fix bug in reshaping labels (#16748) 2022-04-14 19:02:40 +01:00
NielsRogge
048443db86 Improve image classification example (#16585)
* Improve README

* Make dataset_name argument optional

* Improve local data

* Fix bug

* Improve README some more

* Apply suggestions from code review

* Improve README

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-04-14 18:10:52 +02:00
Sylvain Gugger
3e4eec47f5 Kill async pushes when calling push_to_hub with blocking=True (#16755) 2022-04-14 10:02:29 -04:00
Stas Bekman
c21e1071a7 [deepspeed / m2m_100] make deepspeed zero-3 work with layerdrop (#16717)
* [deepspeed / m2m_100] make deepspeed 3 work with layerdrop

* fix

* revert last
2022-04-14 06:51:55 -07:00
Zachary Mueller
89293a0f6b Make nightly install dev accelerate (#16783) 2022-04-14 09:41:02 -04:00
Sylvain Gugger
b151ddb9b9 Fix batch size in evaluation loop (#16763)
* Fix batch size in evaluation loop

* remove debug statement
2022-04-14 09:22:54 -04:00
Sanchit Gandhi
d8269eb4d5 [Flax .from_pretrained] Raise a warning if model weights are not in float32 (#16762)
* [Flax] Raise a warning if model weights are not in float32

* apply suggestions and few small changes

* reorder wording for better readability
2022-04-14 11:52:15 +02:00
Nicolas Patry
195fbbb6cf Enabling Tapex in table question answering pipeline. (#16663)
* Enabling `Tapex` in table question answering pipeline.

* Questions are independent for Tapex, making the test respect that.

* Missing extra space.
2022-04-14 09:06:14 +02:00
Bhadresh Savani
442dc45645 [Doctest] added doctest changes for electra (#16675)
* added doctest changes for electra

* fixed doctest tests

* updated changes
2022-04-13 22:39:00 +02:00
Zachary Mueller
be752d12f8 Fixup no_trainer examples scripts and add more tests (#16765)
* Change tracking to store_true

* Remove step param and use it in the log dictionary directly

* use vars(args) when passing args to init_trackers

* Include tracking tests since tensorboard is already a dep
2022-04-13 14:40:48 -04:00
Stas Bekman
3a16ab25c8 [self-scheduled ci] explain where dependencies are (#16757) 2022-04-13 12:28:02 -04:00
Tu Vu
34ef029dc0 Add self training code for text classification (#16738)
* Add self-training code for text-classification

* Add self-training code for text-classification

* Add self-training code for text-classification

* Add self-training code for text-classification

* Add self-training code for text-classification

* Delete strata
2022-04-13 12:03:24 -04:00
Sylvain Gugger
8e0d3b427f Add defensive check for config num_labels and id2label (#16709)
* Add defensive check for config num_labels and id2label

* Actually check value...

* Only warning inside init plus better error message
2022-04-13 11:28:19 -04:00
Yih-Dar
6bed0647fe Reduce Funnel PT/TF diff (#16744)
* Make Funnel Test less flaky

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-13 17:19:52 +02:00
Joao Gante
0b8f697219 CI: setup-dependent pip cache (#16751)
* Setup-dependent pip cache

* Do not restore from old versions
2022-04-13 16:19:14 +01:00
Stas Bekman
ac43a40e6a [modeling_utils] better explanation of ignore keys (#16741) 2022-04-13 08:03:20 -07:00
Jeremy Fisher
0235bc57ab Fix and improve CTRL doctests (#16573)
* Improve CTRL doctests

* Fix `CTRLForSequenceClassification` flakiness with inconsistent losses

* Remove unused

* Fixup

* Add CTRL to documentation_tests.txt

* Fix control code not being first

* Add output assertions

* Change from sshleifer/tiny-ctrl -> ctrl

* Run `make fixup`

* apply `list` to output logits shape for clarity

* Reduce output loss precision to make assertion more robust

* Add assertion of control code being first

* Fix docstyle

* upper case sentence following control code

* Weird bug fixes

* Add a better generation example

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2022-04-13 15:44:31 +02:00
Michael Chung
06b4aac9eb Add Doc Test for GPT-J (#16507)
* Required the values GPTJ unfortunately cannot run the model =)

* Added the file to the doc tests

* Run Fixup and Style

* Fixed with the test versions of gptj. Ran Style and Fixup.

* Trigger ci

* A Minor Change to License

* Fixed spacing added to the benchmark_utils. Then refactored tests to const variables.

* Removed strings that were included as default parameters anyway.

Co-authored-by: ArEnSc <xx.mike.chung.xx@gmail.com>
2022-04-13 15:04:47 +02:00
Stas Bekman
12bfa97a43 [from_pretrained] refactor find_mismatched_keys (#16706) 2022-04-13 07:50:15 -04:00
davidleonfdez
9f8bfe703c Fix #16660 (tokenizers setters of ids of special tokens) (#16661)
* Fix setters of *_token_id properties of SpecialTokensMixin

* Test setters of common tokens ids

* Move to a separate test checks of setters of tokens ids

* Add independent test for ByT5

* Add Canine test

* Test speech to text
2022-04-13 07:49:06 -04:00
Patrick von Platen
b24201fa44 [Doctests] Fix all T5 doc tests (#16646)
* [Doctests] Fix all T5 doc tests

* make style

* Update docs/source/en/model_doc/t5.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply Sylvain's comments

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-13 11:36:54 +02:00
Santiago Castro
f7196f2e63 Fix decoding score comparison when using logits processors or warpers (#10638)
* Normalize using a logits warper

* Add a flag in `generate` to support the logit renormalization

* Add in RAG
2022-04-13 09:37:33 +01:00
Joao Gante
eb5bdcdfa5 TF generate: handle case without cache in beam search (#16704) 2022-04-12 20:46:10 +01:00
Minh Chien Vu
9c9db751e2 add Bigbird ONNX config (#16427)
* add Bigbird ONNX config
2022-04-12 20:46:06 +02:00
Sanchit Gandhi
a960406722 [FlaxWav2Vec2Model] Fix bug in attention mask (#16725)
* [FlaxWav2Vec2Model] Fix bug in attention mask

* more fixes

* add (Flax)SpeechEncoderDecoderModel PT-FX cross-test
2022-04-12 19:48:24 +02:00
Sanchit Gandhi
6adefba3f0 [FlaxSpeechEncoderDecoder] Fix input shape bug in weights init (#16728)
* [FlaxSpeechEncoderDecoder] Fix input shape bug in weights init

* make style
2022-04-12 19:33:57 +02:00
hiromu
1bac40db8a Add Doc Tests for Reformer PyTorch (#16565)
* start working

* fix: ReformerForQA doctest

* fix: ReformerModelWithLMHead doctest

* fix: ReformerModelForSC doctest

* fix: ReformerModelForMLM doctest

* add: documentation_tests.txt

* make fixup

* change: ReformerModelForSC doctest

* change: checkpoint
2022-04-12 18:52:31 +02:00
Joao Gante
d7f7f29f29 TF: remove set_tensor_by_indices_to_value (#16729) 2022-04-12 17:51:47 +01:00
Anmol Joshi
a315988bae Moved functions to pytorch_utils.py (#16625)
* Moved functions to pytorch_utils.py

* isort formatting

* Reverted tf changes

* isort, make fix-copies

* documentation fix

* Fixed Conv1D import

* Reverted research examples file

* backward compatibility for pytorch_utils

* missing import

* isort fix
2022-04-12 12:38:50 -04:00
Sylvain Gugger
0711c45eae Remove duplicate header (#16732) 2022-04-12 12:37:13 -04:00
Nicolas Patry
a192f61e08 Change the chunk_iter function to handle (#16730)
* Change the chunk_iter function to handle the subtle cases where the last chunk gets ignored since all the data is in the `left_strided` data.

We need to remove the right striding on the previous item.

* Remove commented line.
2022-04-12 18:25:02 +02:00
Anmol Joshi
cc034f72eb Replace assertion with exception (#16720)
* Updated assertions to exceptions

* updated assertions to exceptions

* bug fixes

* fix-copies

* Update modeling_ctrl.py

* Update src/transformers/models/ctrl/modeling_tf_ctrl.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gpt_neo/modeling_gpt_neo.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gptj/modeling_gptj.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gptj/modeling_tf_gptj.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update modeling_led.py

* Update modeling_led.py

* Update modeling_led.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-12 11:47:01 -04:00
Shang Zhang
14daa6102a Qdqbert example add benchmark script with ORT-TRT (#16592)
* add ort-trt benchmark script

* Update README.md

* ort version can be newer

* formatting

* specify ORT version
2022-04-12 11:13:59 -04:00
Heerak Son
db3edd050b Update run_translation_no_trainer.py (#16652)
args.model_name_or_path -> args.config_name
fix it
2022-04-12 08:55:12 -04:00
smelm
b9f12bedd3 Only call get_output_embeddings when tie_word_embeddings is set (#16667)
This avoids an unnecessary call and avoids problems during
initialization of class hierarchies.

Co-authored-by: Samuel Melm <samuel.melm@stud.uni-heidelberg.de>
2022-04-12 07:55:44 -04:00
Michael Chung
924484ee4a Add Doc Test GPT-2 (#16439)
* First Pass All Tests Pass

* WIP

* Adding file to documentation tests

* Change the base model for the example in the doc test.

* Fix code styling by running `make fixup`

* Called Style

* Reverted to gpt2 model rather than distill gpt2. Then used a token classification model over a sequence model for an example.

* Fix Styling Issue

* Hopefully ignores the formatting issue.

Co-authored-by: ArEnSc <xx.mike.chung.xx@gmail.com>
2022-04-12 12:11:03 +02:00
Patrick von Platen
70851a6bf0 [Bart] correct doc test (#16722) 2022-04-12 10:19:49 +02:00
Zachary Mueller
69233cf03b Fix example logs repeating themselves (#16669)
Move declaration of log streams to before tests, so that results won't get compounded on top of each other
2022-04-11 16:25:16 -04:00
Yih-Dar
dce33f2150 Improve PT/TF equivalence test (#16557)
* add error message

* Use names in the error message

* allow ModelOutput

* rename to check_pt_tf_outputs and move outside

* fix style

* skip past_key_values in a better way

* Add comments

* improve code for label/loss

* make the logic clear by moving the ignore keys out

* fix _postprocessing_to_ignore

* fix _postprocessing_to_ignore: create new outputs from the remaining fields

* ignore past_key_values in TFGPT2 models for now

* make check_pt_tf_outputs better regarding names

* move check_pt_tf_models outside

* rename methods

* remove test_pt_tf_model_equivalence in TFCLIPModelTest

* Reduce TFViTMAEModelTest.test_pt_tf_model_equivalence

* move prepare_pt_inputs_from_tf_inputs outside check_pt_tf_models

* Fix quality

* Clean-up TFLxmertModelTester.test_pt_tf_model_equivalence

* Fix quality

* fix

* fix style

* Clean-up TFLEDModelTest.test_pt_tf_model_equivalence

* Fix quality

* add docstring

* improve comment

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-11 22:19:12 +02:00
Yih-Dar
7f7300856d Handle image_embeds in ViltModel (#16696)
* update

* batch_size -> text_batch_size

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-11 22:16:20 +02:00