Yoach Lacombe
704bf595eb
Update Bark generation configs and tests ( #25409 )
...
* update bark generation configs for more coherent parameter
* make style
* update bark hub repo
2023-08-09 18:28:02 +02:00
Yih-Dar
5b517e1764
Use small config for OneFormerModelTest.test_model_with_labels ( #25383 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-08-08 17:15:34 +02:00
Sanchit Gandhi
dedd11160d
[ASR Pipeline] Clarify return timestamps ( #25344 )
...
* [ASR Pipeline] Clarify return timestamps
* fix indentation
* fix ctc check
* fix ctc error message!
* fix test
* fix other test
* add new tests
* final comment
2023-08-08 10:16:00 +01:00
Yih-Dar
6ea3ee3cd2
Fix test_model_parallelism ( #25359 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-08-08 10:48:45 +02:00
Matthew Hoffman
d4bd33cc9f
Register ModelOutput subclasses as supported torch.utils._pytree nodes ( #25358 )
...
* Register ModelOutput subclasses as supported torch.utils._pytree nodes
Fixes #25357 where DDP with static_graph=True does not sync gradients when calling backward() over tensors contained in ModelOutput subclasses
* Add test for torch pytree ModelOutput serialization and deserialization
2023-08-08 08:12:11 +02:00
Pedro Lira
080a97119c
Add mask2former fp16 support ( #25093 )
...
* Add mask2former fp16 support
* Clear consistency/quality issues
* Fix consistency/quality (2)
* Add integration test for mask2former (fp16 case)
* Fix code quality
* Add integration test for maskformer (fp16 case)
* Add integration test for oneformer (fp16 case)
* Remove slow decorator from fp16 tests
* Fix lint
* Remove usage of full inference and value checks for fp16
* Temporarily comment slow for {mask, mask2, one}former
* Add fp16 support to oneformer
* Revert "Temporarily comment slow for {mask, mask2, one}former"
This reverts commit e5371edabd301cf56079def0421a0a87df307cb0.
* Remove dtype conversion noop
2023-08-07 20:07:29 +01:00
Sylvain Gugger
baf1daa58e
Migrate Trainer from Repository to upload_folder ( #25095 )
...
* First draft
* Deal with progress bars
* Update src/transformers/utils/hub.py
Co-authored-by: Lucain <lucainp@gmail.com >
* Address review comments
* Forgot one
* Pin hf_hub
* Add argument for push all and fix tests
* Fix tests
* Address review comments
---------
Co-authored-by: Lucain <lucainp@gmail.com >
2023-08-07 17:47:22 +02:00
Yih-Dar
c177606fb4
Fix more offload edge cases ( #25342 )
...
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-08-07 17:45:41 +02:00
Guillaume "Vermeille" Sanchez
d533465150
add CFG for .generate() ( #24654 )
2023-08-06 20:15:24 +01:00
Yih-Dar
ce6d153a53
Make bark could have tiny model ( #25290 )
...
* temp
* update
* update
* update
* small dim
* small dim
* small dim
* fix
* update
* fix
* fix
* fix
* fix
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-08-04 15:13:14 +02:00
Sylvain Gugger
f0fd73a2de
Document check copies ( #25291 )
...
* Document check copies better and add tests
* Include header in check for copies
* Manual fixes
* Try autofix
* Fixes
* Clean tests
* Finalize doc
* Remove debug print
* More fixes
2023-08-04 14:56:29 +02:00
Sylvain Gugger
29f04002e6
Deal with nested configs better in base class ( #25237 )
...
* Deal better with nested configs
* Fixes
* More fixes
* Fix last test
* Clean up existing configs
* Remove hack in MPT Config
* Update src/transformers/configuration_utils.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
* Fix setting a nested config via dict in the kwargs
* Adapt common test
* Add test for nested config load with dict
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
2023-08-04 14:56:09 +02:00
Sylvain Gugger
fab1a0aa82
Give more memory in test_disk_offload ( #25315 )
2023-08-04 14:10:31 +02:00
Roland Szabo
d114a6b71f
Add timeout parameter to load_image function ( #25184 )
...
* Add timeout parameter to load_image function.
* Remove line.
* Reformat code
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Add parameter to docs.
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
2023-08-03 15:51:54 +01:00
Yoach Lacombe
6d3f9c1e2e
add generate method to SpeechT5ForTextToSpeech ( #25233 )
...
* add generate method to SpeechT5ForTextToSpeech
* update speecht5forTTS docstrings
* Remove defaults to None in generate docstrings
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2023-08-03 14:12:07 +01:00
amyeroberts
30409af6e1
Update InstructBLIP & Align values after rescale update ( #25209 )
...
* Update InstructBLIP values
Note: the tests are not independent. Running the test independently produces different logits compared to running all the integration tests
* Update test values after rescale update
* Remove left over commented out code
* Revert to previous rescaling logic
* Update rescale tests
2023-08-03 11:01:10 +01:00
Yih-Dar
bd90cda9a6
CI with num_hidden_layers=2 🚀 🚀 🚀 ( #25266 )
...
* CI with layers=2
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-08-02 20:22:36 +02:00
Patrick von Platen
b28ebb2655
[MMS] Fix mms ( #25267 )
...
* [MMS] Fix mms
* [MMS] Fix mms
* fix mms loading
* Apply suggestions from code review
* make style
* Update tests/models/wav2vec2/test_modeling_wav2vec2.py
2023-08-02 18:11:15 +02:00
Yupeng Jia
8021c684ec
Fix some bugs for two stage training of deformable detr ( #25045 )
...
* Update modeling_deformable_detr.py
Fix bugs for two stage training
* Update modeling_deformable_detr.py
* Add test_two_stage_training to DeformableDetrModelTest
---------
Co-authored-by: yupeng.jia <yupeng.jia@momenta.ai >
2023-08-02 11:30:36 +01:00
amyeroberts
1b35409768
Update rescale tests - cast to float after rescaling to reflect #25229 ( #25259 )
...
Rescale tests - cast to float after rescaling to reflect #25229
2023-08-02 11:29:55 +01:00
YQ
2230d149f0
fix get_keys_to_not_convert() to return correct modules for full precision inference ( #25105 )
...
* add test for `get_keys_to_not_convert`
* add minimum patch to keep mpt lm_head from 8bit quantization
* add revision to
2023-08-02 04:21:52 -04:00
Younes Belkada
05ebb0264e
[MPT] Add require_bitsandbytes on MPT integration tests ( #25201 )
...
* add `require_bitsandbytes` on MPT integration tests
* add it on mpt as well
2023-08-01 12:20:34 +02:00
Yih-Dar
1b4f6199c6
Update tiny model info. and pipeline testing ( #25213 )
...
* update tiny_model_summary.json
* update
* update
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-07-31 19:35:33 +02:00
Yih-Dar
9ca3aa0156
Fix all_model_classes in FlaxBloomGenerationTest ( #25211 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-07-31 17:32:05 +02:00
amyeroberts
05cda5df34
🚨 🚨 🚨 Fix rescale ViVit Efficientnet ( #25174 )
...
* Fix rescaling bug
* Add tests
* Update integration tests
* Fix up
* Update src/transformers/image_transforms.py
* Update test - new possible order in list
2023-07-28 19:52:51 +01:00
Sanchit Gandhi
03f98f9683
[MusicGen] Fix integration tests ( #25169 )
...
* move to device
* update with cuda values
* fix fp16
* more rigorous
2023-07-28 18:50:15 +01:00
Younes Belkada
dd9d45b6ec
[InstructBlip] Fix instructblip slow test ( #25171 )
...
* fix instruct blip slow test
* Update tests/models/instructblip/test_modeling_instructblip.py
2023-07-28 17:00:10 +02:00
Younes Belkada
add0895dd9
[Mpt] Fix mpt slow test ( #25170 )
...
fix mpt slow test
2023-07-28 16:45:09 +02:00
Lucain
c1dba1111b
Add test when downloading from gated repo ( #25039 )
2023-07-28 08:14:27 -04:00
Sanchit Gandhi
e93103632b
Add bloom flax ( #25094 )
...
* First commit
* step 1 working
* add alibi
* placeholder for `scan`
* add matrix mult alibi
* beta scaling factor for bmm
* working v1 - simple forward pass
* move layer_number from attribute to arg in call
* partial functioning scan
* hacky working scan
* add more modifs
* add test
* update scan for new kwarg order
* fix position_ids problem
* fix bug in attention layer
* small fix
- do the alibi broadcasting only once
* prelim refactor
* finish refactor
* alibi shifting
* incorporate dropout_add to attention module
* make style
* make padding work again
* update
* remove bogus file
* up
* get generation to work
* clean code a bit
* added small tests
* adding alibi test
* make CI tests pass:
- change init weight
- add correct tuple for output attention
- add scan test
- make CI tests work
* fix few nits
* fix nit onnx
* fix onnx nit
* add missing dtype args to nn.Modules
* remove debugging statements
* fix scan generate
* Update modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* fix small test issue + make style
* clean up
* Update tests/models/bloom/test_modeling_flax_bloom.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com >
* fix function name
* small fix test
* forward contrib credits from PR17761
* Fix failing test
* fix small typo documentation
* fix non passing test
- remove device from build alibi
* refactor call
- refactor `FlaxBloomBlockCollection` module
* make style
* upcast to fp32
* cleaner way to upcast
* remove unused args
* remove layer number
* fix scan test
* make style
* fix i4 casting
* fix slow test
* Update src/transformers/models/bloom/modeling_flax_bloom.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com >
* remove `layer_past`
* refactor a bit
* fix `scan` slow test
* remove useless import
* major changes
- remove unused code
- refactor a bit
- revert import `torch`
* major refactoring
- change build alibi
* remove scan
* fix tests
* make style
* clean-up alibi
* add integration tests
* up
* fix batch norm conversion
* style
* style
* update pt-fx cross tests
* update copyright
* Update src/transformers/modeling_flax_pytorch_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* per-weight check
* style
* line formats
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com >
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com >
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
Co-authored-by: haileyschoelkopf <haileyschoelkopf@users.noreply.github.com >
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2023-07-27 18:24:56 +01:00
Yoach Lacombe
0b92ae3489
Add offload support to Bark ( #25037 )
...
* initial Bark offload proposal
* use hooks instead of manually offloading
* add test of bark offload to cpu feature
* Apply nit suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* Update docstrings of offload
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com >
* remove unnecessary set_seed in Bark tests
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com >
2023-07-27 15:35:17 +01:00
Arthur
9cea3e7b80
[MptConfig] support from pretrained args ( #25116 )
...
* support from pretrained args
* draft addition of tests
* update test
* use parent assert true
* Update src/transformers/models/mpt/configuration_mpt.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
2023-07-27 16:24:52 +02:00
amyeroberts
659829b6ae
MaskFormer - enable return_dict in order to compile ( #25052 )
...
* Enable return_dict in order to compile
* Update tests
2023-07-26 16:23:30 +01:00
Yih-Dar
224da5df69
update use_auth_token -> token ( #25083 )
...
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-07-26 15:09:59 +02:00
Yih-Dar
31acba5697
Fix PvtModelIntegrationTest::test_inference_fp16 ( #25106 )
...
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-07-26 14:57:44 +02:00
Sebastian Husch Lee
8f36ab3e22
[T5, MT5, UMT5] Add [T5, MT5, UMT5]ForSequenceClassification ( #24726 )
...
* Initial addition of t5forsequenceclassification
* Adding imports and adding tests
* Formatting
* Running make fix-copies
* Adding mt5forseq
* Formatting
* run make fix-copies
* Adding to docs
* Add model_parallel
* Fix bug
* Fix
* Remove TODO
* Fixing tests for T5ForSequenceClassification
* Undo changes to dependency_versions_table.py
* Change classification head to work with T5Config directly
* Change seq length to let tests pass
* PR comments for formatting
* Formatting
* Initial addition of UMT5ForSequenceClassification
* Adding to inits and formatting
* run make fix-copies
* Add doc for UMT5ForSeqClass
* Update UMT5 config
* Fix docs
* Skip torch fx test for SequenceClassification
* Formatting
* Add skip to UMT5 tests as well
* Fix umt5 tests
* Running make fix-copies
* PR comments
* Fix for change to sentence_representation
* Rename seq_len to hidden_size since that's what it is
* Use base_model to follow format of the rest of the library
* Update docs
* Extract the decoder_input_ids changes and make one liner
* Make one-liner
2023-07-25 21:02:49 +02:00
Arthur
f9cc333805
[ PreTrainedTokenizerFast] Keep properties from fast tokenizer ( #25053 )
...
* draft solution
* use `setdefault`
* nits
* add tests and fix truncation issue
* fix test
* test passes locally
* quality
* updates
* update tests
2023-07-25 18:45:01 +02:00
Connor Henderson
0779fc8eb8
Edit err message and comment in test_model_is_small ( #25087 )
...
* Edit err message and comment in
* put back 80M comment
2023-07-25 12:24:36 -04:00
Arthur
dcb183f4bd
[MPT] Add MosaicML's MPT model to transformers ( #24629 )
...
* draft add new model like
* some cleaning of the config
* nits
* add nested configs
* nits
* update
* update
* added layer norms + triton kernels
* consider only LPLayerNorm for now.
* update
* all keys match.
* Update
* fixing nits here and there
* working forward pass.
* removed einops dependency
* nits
* format
* add alibi
* byebye head mask
* refactor attention
* nits.
* format
* fix nits.
* nuke and updates
* nuke tokenizer test
* don't reshape query with kv heads
* added a bit of documentation.
* remove unneeded things
* nuke more stuff
* nit
* logits match - same generations
* rm unneeded methods
* 1 remaining failing CI test
* nit
* fix nits
* fix docs
* fix docs
* rm tokenizer
* fixup
* fixup
* fixup and fix tests
* fixed configuration object.
* use correct activation
* few minor fixes
* clarify docs a bit
* logits match to 1e-12
* skip and unskip a test
* added some slow tests.
* fix readme
* add more details
* Update docs/source/en/model_doc/mpt.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* fix configuration issues
* more fixes in config
* added more models
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* remove unneeded position ids
* fix some comments
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* revert suggestion
* mpt alibi + added batched generation
* Update src/transformers/models/mpt/__init__.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* remove init config
* Update src/transformers/models/mpt/configuration_mpt.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* fix nit
* add another slow test
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* fits in one line
* some refactor because make fixup doesn't pass
* add ft notebook
* update md
* correct doc path
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com >
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2023-07-25 14:32:40 +02:00
Xuehai Pan
6bc61aa7af
Set TF32 flag for PyTorch cuDNN backend ( #25075 )
2023-07-25 08:04:48 -04:00
Sylvain Gugger
f295fc8a16
Fix last models for common tests that are too big. ( #25058 )
...
* Fix last models for common tests that are too big.
* Remove print statement
2023-07-25 07:56:04 -04:00
Rinat
a03d13c83d
Pvt model ( #24720 )
...
* pull and push updates
* add docs
* fix modeling
* Add and run test
* make copies
* add task
* fix tests and fix small issues
* Checks on a Pull Request
* fix docs
* add desc pvt.md
2023-07-24 15:34:19 +01:00
Sylvain Gugger
afe8bfc075
Comment again print statement
2023-07-24 10:12:20 -04:00
Sylvain Gugger
42571f6eb8
Make more test models smaller ( #25005 )
...
* Make more test models tiny
* Make more test models tiny
* More models
* More models
2023-07-24 10:08:47 -04:00
Zach Mueller
3b734f5042
Add dispatch_batches to training arguments ( #25038 )
...
* Dispatch batches
* Copy items
2023-07-24 09:27:19 -04:00
Arthur
0511369a8b
[LlamaConfig] Nit: pad token should be None by default ( #24958 )
...
* pad token should be None by default
* fix tests
* nits
2023-07-21 14:32:34 +02:00
Benjamin Badger
caf5e369fc
Contrastive Search peak memory reduction ( #24120 )
...
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com >
2023-07-20 18:46:53 +01:00
Joao Gante
89136ff7f8
Generate: sequence bias can handle same terminations ( #24822 )
2023-07-20 12:23:17 +01:00
Tom Aarsen
79444f370f
Deprecate unused OpenLlama architecture ( #24922 )
...
* Resolve typo in check_repo.py
* Specify encoding when opening modeling files
* Deprecate the OpenLlama architecture
* Add disclaimer pointing to Llama
I'm open to different wordings here
* Match the capitalisation of LLaMA
2023-07-20 07:03:24 -04:00
Arthur
07360b6c9c
[Llama2] Add support for Llama 2 ( #24891 )
...
* add llama
* add other readmes
* update padding id in readme
* add link to paper
* fix paths and tokenizer
* more nits
* styling
* fit operation in 2 lines when possible
* nits
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* add form
* update readme
* update readme, we don't have a default pad token
* update test and tokenization
* LLaMA instead of Llama
* nits
* add expected text
* add greedy output
* styling
* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* sequential device map
* skip relevant changes
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2023-07-18 15:18:31 -04:00