NielsRogge
479fdc4925
Add semantic script, trainer ( #16834 )
...
* Add first draft
* Improve script and README
* Improve README
* Apply suggestions from code review
* Improve script, add link to resulting model
* Add corresponding test
* Adjust learning rate
2022-04-27 10:12:18 +02:00
Anton Lozhkov
a4a88fa09f
[Research] Speed up evaluation for XTREME-S ( #16785 )
...
* Avoid repeated per-lang filtering
* Language groups and logits preprocessing
* Style
2022-04-27 08:34:21 +02:00
Yongliang Shen
2d91e3c304
use original loaded keys to find mismatched keys ( #16920 )
2022-04-26 17:29:52 -04:00
nikkie
d365f5074f
Fix RuntimeError message format ( #16906 )
2022-04-26 17:08:28 -04:00
Yang Ming
10dfa126b7
documentation: some minor clean up ( #16850 )
2022-04-26 16:56:08 -04:00
Krishna Sirumalla
aaee4038c3
Add onnx config for RoFormer ( #16861 )
...
* add roformer onnx config
2022-04-26 16:51:15 +02:00
Ahmed Elnaggar
8afaaa26f5
Fix Iterations for decoder ( #16934 )
...
Fix Iterations for decoder
2022-04-26 12:54:14 +02:00
Manuel
fa32247406
apply torch int div to layoutlmv2 ( #15457 )
...
* apply torch int div
* black linting fixup
* update path to torch_int_div
* clarify imports
2022-04-26 10:07:51 +02:00
Sylvain Gugger
344b9fb0c6
Limit the use of PreTrainedModel.device ( #16935 )
...
* Limit the use of PreTrainedModel.device
* Fix
2022-04-25 20:58:50 -04:00
code-review-doctor
6568752039
Fix issue probably-meant-fstring found at https://codereview.doctor ( #16913 )
2022-04-25 15:15:00 -04:00
Sanchit Gandhi
fea94d6790
Replace deprecated logger.warn with warning ( #16876 )
2022-04-25 15:12:51 -04:00
Joao Gante
e03966e404
TF: XLA stable softmax ( #16892 )
...
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2022-04-25 20:10:51 +01:00
Rushi Chaudhari
8246caf3eb
added deit onnx config ( #16887 )
...
* added deit onnx config
2022-04-25 20:50:45 +02:00
Joao Gante
9331b37967
TF: XLA Logits Warpers ( #16899 )
...
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com >
2022-04-25 19:48:08 +01:00
Joao Gante
809dac48f9
TF: XLA logits processors - minimum length, forced eos, and forced bos ( #16912 )
...
* XLA min len, forced eos, and forced bos
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com >
2022-04-25 19:27:53 +01:00
Yih-Dar
f6210c49e2
Fix RemBertTokenizerFast ( #16933 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2022-04-25 19:51:50 +02:00
Yih-Dar
32adbb26d6
Fix PyTorch RAG tests GPU OOM ( #16881 )
...
* add torch.cuda.empty_cache in some PT RAG tests
* torch.cuda.empty_cache in tearDownModule()
* tearDown()
* add gc.collect()
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2022-04-25 17:33:56 +02:00
Yih-Dar
3e47d19cfc
Add missing ckpt in config docs ( #16900 )
...
* add missing ckpt in config docs
* add more missing ckpt in config docs
* fix wrong ckpts
* fix realm ckpt
* fix s2t2
* fix xlm_roberta ckpt
* Fix for deberta v2
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* use only one checkpoint for DPR
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com >
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com >
2022-04-25 17:31:45 +02:00
Patrick von Platen
3a71e94a92
Fix doc test quicktour dataset ( #16929 )
...
* fix doc test
* fix doc test
Co-authored-by: Patrick <patrick@pop-os.localdomain >
2022-04-25 16:26:59 +02:00
Thomas Chaigneau
508baf1943
add bigbird typo fixes ( #16897 )
...
Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com >
2022-04-25 11:32:06 +02:00
Patrick von Platen
72728be3db
[DocTests] Fix some doc tests ( #16889 )
...
* [DocTests] Fix some doc tests
* hacky fix
* correct
2022-04-23 08:40:14 +02:00
cavdard
22fc93c4d9
Changes in create_optimizer to support tensor parallelism with SMP ( #16880 )
...
* changes in create optimizer to support tensor parallelism with SMP
* Update src/transformers/trainer.py
Convert if check to one line.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
Co-authored-by: Cavdar <dcavdar@a07817b12d7e.ant.amazon.com >
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2022-04-22 15:24:38 -04:00
Joao Gante
99c8226b12
TF: XLA repetition penalty ( #16879 )
2022-04-22 18:29:32 +01:00
Thomas Chaigneau
ec81c11a18
Add OnnxConfig for ConvBERT ( #16859 )
...
* add OnnxConfig for ConvBert
Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com >
2022-04-22 18:19:15 +02:00
Minh Chien Vu
0d1cff1195
Add doc tests for Albert and Bigbird ( #16774 )
...
* Add doctest BERT
* make fixup
* fix typo
* change checkpoints
* make fixup
* define doctest output value, update doctest for mobilebert
* solve fix-copies
* update QA target start index and end index
* change checkpoint for docs and reuse defined variable
* Update src/transformers/models/bert/modeling_tf_bert.py
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com >
* Apply suggestions from code review
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com >
* Apply suggestions from code review
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com >
* make fixup
* Add Doctest for Albert and Bigbird
* make fixup
* overwrite examples for Albert and Bigbird
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com >
* update longer examples for Bigbird
* using examples from squad_v2
* print out example text
* change name token-classification-big-bird checkpoint to random
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com >
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com >
2022-04-22 18:07:16 +02:00
Mario Šaško
9fa88172c2
Minor fixes/improvements in convert_file_size_to_int ( #16891 )
...
* Minor improvements to `convert_file_size_to_int`
* Add <unit>bit version to kilos and megas
* Minor fix
2022-04-22 16:54:20 +02:00
Joao Gante
6d90d76f5d
TF: rework XLA generate tests ( #16866 )
2022-04-22 12:38:08 +01:00
Yih-Dar
3b1bbefc47
Add missing entries in mappings ( #16857 )
...
* add missing entries in some mappings
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2022-04-22 10:53:24 +02:00
Loubna Ben Allal
d91841315a
New features for CodeParrot training script ( #16851 )
...
* add tflops logging and fix grad accumulation
* add accelerate tracking and checkpointing
* scale loss of last batch correctly
* fix typo
* compress loss computation
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com >
* add resume from checkpoint argument
* add load_state accelerate from checkpoint, register lr scheduler and add tflops function
* reformat code
* reformat code
* add condition on path for resume checkpoint
* combine if conditions
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com >
* add source for tflops formula
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com >
2022-04-21 18:43:46 +02:00
Yih-Dar
eef2422e96
Fix doctest list ( #16878 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2022-04-21 18:12:14 +02:00
Thomas Chaigneau
0b1e0fcf7a
Fix GPT-J onnx conversion ( #16780 )
...
* add gptj to TOKENIZER_MAPPING_NAMES
* fix int32 to float to avoid problem in onnx
* Update src/transformers/models/gptj/modeling_gptj.py
Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com >
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com >
2022-04-21 15:55:30 +02:00
Eldar Kurtic
bae9b6458c
Use ACT2FN to fetch ReLU activation ( #16874 )
...
- all activations should be fetched through ACT2FN
- it returns ReLU as an `nn.Module`, which allows attaching hooks to the activation function and makes it appear in the output of `print(model)`
2022-04-21 09:33:29 -04:00
Sylvain Gugger
cb555af2c7
Return input_ids in ImageGPT feature extractor ( #16872 )
2022-04-21 09:09:00 -04:00
Nicolas Patry
e789418ebe
Adding support for array key in raw dictionaries in ASR pipeline. ( #16827 )
...
* Adding support for `array` key in raw dictionaries in ASR pipeline.
* ES .
* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* Making it work by not popping `array` first.
* Black 22.3
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2022-04-21 14:39:10 +02:00
ghlai9665
daf520b033
tiny tweak to allow BatchEncoding.token_to_char when token doesn't correspond to chars ( #15901 )
...
* tweak to allow BatchEncoding.char_to_token(0)
* update docstring
* remove trailing whitespace
* make fixup
* make value checking for span_indices explicit
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2022-04-21 08:07:54 -04:00
Stefan Schweter
cb7e166428
t5: add conversion script for T5X to FLAX ( #16853 )
...
* t5: add conversion script for T5X to FLAX
* t5: make flake happy
* t5: add copyright message to t5x conversion script
* t5: fix lm head for v1.0 checkpoints
2022-04-21 13:00:35 +02:00
Nicolas Patry
6620f60c0a
Long QuestionAnsweringPipeline fix. ( #16778 )
...
* Temporary commit with the long QA fix.
* Adding slow tests covering this fix.
* Removing fast test as it doesn't fail anyway.
2022-04-21 09:59:25 +02:00
Zachary Mueller
705d65368f
Fix multiproc metrics in no_trainer examples ( #16865 )
2022-04-20 17:26:27 -04:00
Sylvain Gugger
175da8d182
Fix custom init sorting script ( #16864 )
2022-04-20 17:05:39 -04:00
Stas Bekman
67ed0e43dc
[docs] fix url ( #16860 )
2022-04-20 11:01:24 -07:00
Stas Bekman
afa1ef0992
[modeling_utils] use less cpu memory with sharded checkpoint loading ( #16844 )
...
* less cpu memory with sharded checkpoint loading
* Trigger CI
* Trigger CI
2022-04-20 07:44:37 -07:00
Nicolas Patry
e13a91fe60
Fixing return type tensor with num_return_sequences>1. ( #16828 )
...
* Fixing return type tensor with `num_return_sequences>1`.
* Nit.
2022-04-20 16:11:51 +02:00
Yang Ming
ff06b17791
add DebertaV2 fast tokenizer ( #15529 )
...
Co-authored-by: alcinos <carion.nicolas@gmail.com >
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com >
Co-authored-by: Nicolas Carion <carion.nicolas@gmail.com >
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2022-04-20 10:26:51 +02:00
Patrick von Platen
e1c153cbaa
[Typo] Fix typo in modeling utils ( #16840 )
2022-04-19 23:09:03 +02:00
Manuel R. Ciosici
3104036e7f
Add support for bitsandbytes ( #15622 )
...
* Add initial BNB integration
* fixup! Add initial BNB integration
* Add bnb test decorator
* Update Adamw8bit option name
* Use the full bnb package name
* Override bnb for all embedding layers
* Fix package name
* Formatting
* Remove unnecessary import
* Update src/transformers/trainer.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com >
* Rename AdamwBNB optimizer option
* Add training test checking that bnb memory utilization is lower
* fix merge
* fix merge; fix + extend new test
* cleanup
* expand bnb
* move all require_* candidates to testing_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com >
Co-authored-by: Stas Bekman <stas@stason.org >
2022-04-19 16:01:29 -04:00
Yih-Dar
e6d23a4b9b
Improve test_pt_tf_model_equivalence on PT side ( #16731 )
...
* Update test_pt_tf_model_equivalence on PT side
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2022-04-19 21:13:27 +02:00
Dahlbomii
3dd57b15c5
Type hints added to Speech to Text ( #16506 )
...
* Type hints added
* return hints added
* Update src/transformers/models/speech_to_text/modeling_tf_speech_to_text.py
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com >
2022-04-19 17:58:08 +01:00
SaulLu
1efca4e6c8
replace Speech2TextTokenizer by Speech2TextFeatureExtractor in some docstrings ( #16835 )
...
* replace `Speech2TextTokenizer` by `Speech2TextFeatureExtractor` in docstring
* quality
2022-04-19 18:32:22 +02:00
Jeevesh Juneja
b5c6a63ed9
Correct Logging of Eval metric to Tensorboard ( #16825 )
...
* Correct Logging of Eval metric to Tensorboard
An empty dictionary ``eval_metrics`` was being logged; it is replaced by ``eval_metric``, the output dictionary of ``metric.compute()``.
* Remove unused variable
2022-04-19 17:27:54 +02:00
Joao Gante
f09c45e067
TF: Add sigmoid activation function ( #16819 )
2022-04-19 16:13:08 +01:00