Raushan Turganbay
b1065aa08a
Generation: get special tokens from model config ( #30899 )
...
* fix
* let's do this way?
* codestyle
* update
* add tests
2024-05-22 18:15:41 +02:00
Arthur
1d568dfab2
legacy to init the slow tokenizer when converting from slow was wrong ( #30972 )
2024-05-22 18:06:50 +02:00
Yih-Dar
1432f641b8
Finally fix the missing new model failure CI report ( #30968 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-22 17:48:26 +02:00
amyeroberts
dff54ad2d9
🚨 out_indices always a list ( #30941 )
...
* out_indices always a list
* Update src/transformers/utils/backbone_utils.py
* Update src/transformers/utils/backbone_utils.py
* Move type casting
* nit
2024-05-22 15:23:04 +01:00
Pablo Montalvo
250ae9f746
Paligemma - fix slow tests, add bf16 and f16 slow tests ( #30851 )
...
* fix slow tests, add bf16 and f16 slow tests
* few fixes
* [run-slow]paligemma
* add gate decorator
* [run-slow]paligemma
* add missing gating
* [run-slow]paligemma
* [run-slow]paligemma
2024-05-22 16:20:07 +02:00
Sanchit Gandhi
ada86f973c
[whisper] only trigger forced ids warning once ( #30966 )
2024-05-22 15:06:51 +01:00
Jonatan Kłosko
1518508467
Avoid extra chunk in speech recognition ( #29539 )
2024-05-22 14:07:51 +01:00
Vaibhav Srivastav
24d2a5e1a3
[doc] Add references to the fine-tuning blog and distil-whisper to Whisper. ( #30938 )
...
[doc] Add references to the fine-tuning blog and distil-whisper to Whisper doc.
2024-05-22 14:06:09 +01:00
Marc Sun
5c186003b8
Fix low cpu mem usage tests ( #30808 )
...
* Fix tests
* fix udop failing test
* remove skip
* style
2024-05-22 14:09:01 +02:00
Raushan Turganbay
934e1b84e9
Update video-llava docs ( #30935 )
...
* update video-llava
* Update docs/source/en/model_doc/video_llava.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-22 16:56:41 +05:00
dependabot[bot]
edb14eba64
Bump requests from 2.31.0 to 2.32.2 in /examples/research_projects/lxmert ( #30956 )
...
---
updated-dependencies:
- dependency-name: requests
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-22 11:27:41 +01:00
Arthur
8e8786e5f0
Update build ci image [push-ci-image] ( #30933 )
...
* [build-ci-image]
* correct branch
* push ci image
* [build-ci-image]
* update scheduled as well
* [push-ci-image]
* [build-ci-image]
* [push-ci-image]
* update deps
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* oups [build-ci-image]
* [push-ci-image]
* fix
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* updated
* [build-ci-image] update tag
* [build-ci-image]
* [build-ci-image]
* fix tag
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* github name
* commit_title?
* fetch
* update
* it not found
* dev
* dev
* [push-ci-image]
* dev
* dev
* update
* dev
* dev print dev commit message dev
* dev ? dev
* dev
* dev
* dev
* dev
* [build-ci-image]
* [build-ci-image]
* [push-ci-image]
* revert unwanted
* revert convert as well
* no you are not important
* [build-ci-image]
* Update .circleci/config.yml
* pin tf probability dev
2024-05-22 10:52:59 +02:00
Arthur
673440d073
update ruff version ( #30932 )
...
* update ruff version
* fix research projects
* Empty
* Fix errors
---------
Co-authored-by: Lysandre <lysandre@huggingface.co>
2024-05-22 06:40:15 +02:00
NielsRogge
60bb571e99
🚨 [Idefics2] Update ignore index ( #30898 )
...
* Update ignore index
* Update docs
* Update docs
2024-05-21 19:38:02 +02:00
Lu Teng
5bf9caa06d
Fix inhomogeneous shape error in example ( #30434 )
...
Fix inhomogeneous shape error in example.
2024-05-21 18:14:11 +01:00
amyeroberts
d24097e022
Fix swin embeddings interpolation ( #30936 )
2024-05-21 15:40:19 +01:00
Younes Belkada
eae2b6b89e
TST / Workflows: Get slack notifications for docker image build ( #30891 )
...
* Get slack notifications for docker image build
* Apply suggestions from code review
* Apply suggestions from code review
2024-05-21 15:54:41 +02:00
Yih-Dar
64e0573a81
[Benchmark] Reuse optimum-benchmark ( #30615 )
...
* benchmark
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-21 15:15:19 +02:00
Matthew Beckers
3b09d3f05f
fix: center_crop occasionally outputs off-by-one dimension matrix ( #30934 )
...
If required padding for a crop larger than input image is odd-numbered,
the padding would be rounded down instead of rounded up, causing the
output dimension to be one smaller than it should be.
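
To illustrate the rounding issue described above, here is a minimal one-axis sketch. It is not the library's code: `cropped_length` and its `round_up` flag are hypothetical names introduced for illustration, and the index arithmetic is an assumed simplification of a center crop that zero-pads inputs smaller than the crop size.

```python
import math


def cropped_length(orig: int, crop: int, round_up: bool) -> int:
    """Length along one axis after a center crop that zero-pads small inputs.

    Hypothetical helper: a simplified model of the behaviour described in the
    commit message, not the actual implementation.
    """
    # Crop window in original coordinates; start is negative when crop > orig.
    start = (orig - crop) // 2
    end = start + crop

    # The padded canvas is at least as large as the requested crop.
    padded = max(orig, crop)
    half = (padded - orig) / 2

    # Padding placed before the original data: rounded up or rounded down.
    pad_before = math.ceil(half) if round_up else math.floor(half)

    # Shift the window into padded coordinates and clamp it to the canvas.
    start += pad_before
    end += pad_before
    return min(padded, end) - max(0, start)


# Required padding is odd (6 - 3 = 3): rounding down loses one element.
print(cropped_length(3, 6, round_up=False))
print(cropped_length(3, 6, round_up=True))
```

With an even amount of padding (for example `orig=4`, `crop=6`) both rounding modes agree in this sketch, which is consistent with the bug appearing only occasionally.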
2024-05-21 13:56:52 +01:00
Zach Mueller
daf281f44f
Enforce saving at end of training if saving option chosen ( #30160 )
...
* Enforce saving at end of training
* Fix test
* Rework test
* Fixup tests
* Update comment based on sourab feedback
* Clean
2024-05-21 07:50:11 -04:00
Mohit Sharma
7a4792e6b3
CI: AMD MI300 tests fix ( #30797 )
...
* add fix
* update import
* updated dicts and comments
* remove prints
* Update testing_utils.py
2024-05-21 12:46:07 +01:00
hoshi-hiyouga
a755745546
PaliGemma - fix processor with no input text ( #30916 )
...
Update processing_paligemma.py
2024-05-21 10:43:22 +01:00
dependabot[bot]
d502bd6475
Bump requests from 2.31.0 to 2.32.0 in /examples/research_projects/decision_transformer ( #30925 )
...
---
updated-dependencies:
- dependency-name: requests
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-21 09:41:29 +01:00
Younes Belkada
8871b26150
FEAT / Trainer: LOMO optimizer support ( #30178 )
...
* add V1 - adalomo not working yet
* add todo docs + refactor from comments
* adjust LR
* add docs
* add more elaborated test
* Apply suggestions from code review
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
* fix
* push
* add accelerate check
* fix DDP case
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix
* init kwargs
* safely add attribute
* revert to enum logic
* Update src/transformers/trainer.py
---------
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-21 10:16:37 +02:00
Younes Belkada
c876d12127
FIX / TST: Fix expected results on Mistral slow test (A10) ( #30909 )
...
Update test_modeling_mistral.py
2024-05-21 09:14:14 +02:00
Aaron Jimenez
0df888ffb7
[docs] Spanish translation of model_memory_anatomy.md ( #30885 )
...
* add model_memory_anatomy to es/_toctree.yml
* copy model_memory_anatomy.md to es/
* translate first section
* translate doc
* change forward activations
* fix sentence and link to Trainer
* fix Trainer link
2024-05-20 16:48:52 -07:00
Longjie Zheng
616bb11d48
Add torch.compile for Mistral ( #30642 )
...
* first version
* fix sliding window
* fix style
* add sliding window cache
* fix style
* address comments
* fix test
* fix style
* move sliding window check inside cache init
* revert changes on irrelevant files & add comment on SlidingWindowCache
* address comments & fix style
fix style
* update causal mask
* [run-slow] mistral
* [run-slow] mistral
* [run-slow] mistral
* [run-slow] mistral
* [run-slow] mistral
* [run-slow] llama
* [run-slow] mistral
* [run-slow] mistral
* [run-slow] mistral
* revert CI from a10 to t4
* wrap up
2024-05-20 16:27:24 +02:00
Zach Mueller
92d1d97c05
Introduce configured_state arg for accelerator_config ( #29781 )
...
* Introduce configured_state
* Include note on tuning
* Allow for users to have defined a state already
* Include tests
* Add note on hpam tune
* Guard a bit better
* Update src/transformers/training_args.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/training_args.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Finish rebase
* Finish rebase
* Guard carefully
* Fixup test
* Refactor
* Fin refactor
* Comment
* Update wrt feedback
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-20 09:21:40 -04:00
Arthur
bb48e92186
tokenizer_class = "AutoTokenizer" Llava Family ( #30912 )
...
propagate changes to more models
2024-05-20 13:56:11 +02:00
Anton Vlasjuk
76e05301c3
Fix a shape annotation and typos in mamba slow forward ( #30691 )
...
* fix typos and one shape comment
* fix `intermediade` typo in jamba
2024-05-20 13:55:57 +02:00
Yoach Lacombe
e6708709cb
Add AutoFeatureExtractor support to Wav2Vec2ProcessorWithLM ( #28706 )
...
* Add AutoFeatureExtractor support to Wav2Vec2ProcessorWithLM
* update with a type filter
* add raises error test
* fix added test
2024-05-20 13:40:42 +02:00
Hafedh
c11ac7857b
fix for custom pipeline configuration ( #29004 )
...
* fix for custom pipeline configuration
* fix for custom pipelines
* remove extra exception
* added test for custom pipelines extra tag
* format with ruff
* limit extra tag for first time only
* format with ruff
* improve tests for custom pipelines
2024-05-20 11:38:32 +02:00
Eric2i
7b4b456438
separate kwargs in processor (similar to #30193 ) ( #30905 )
...
* Fix similar bug in processor (related to #30193 )
* Reformat processing_git.py to comply with ruff formatting
2024-05-20 10:18:17 +01:00
Goncalo Paulo
1834916481
Fix num_hidden_layers in initialization of new model in Mamba ( #30403 )
...
Fix num_hidden_layers in initialization
Originally, the initialization was using config.num_layers instead of config.num_hidden_layers. This fixes that.
2024-05-20 11:18:09 +02:00
Kamil Akesbi
1c2bb3ac54
add return_token_timestamps to WhisperProcessor ( #30812 )
...
* compute num_frames in WhisperFeatureExtractor
* add return_num_frames in WhisperFeatureProcessor + adapt pipeline
* return_timestamps renaming + pipeline fix
* fix
* fix
* fix
* add tests
* Update src/transformers/models/whisper/feature_extraction_whisper.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* apply review changes
* fix
* Update src/transformers/models/whisper/feature_extraction_whisper.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update tests/models/whisper/test_modeling_whisper.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* apply review
* fix
* review changes
* Update src/transformers/models/whisper/feature_extraction_whisper.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* make style quality
* EXPECTED_OUTPUT in single line
* small numpy->torch fix
* fix
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-20 09:53:58 +01:00
Donggeun Yu
66b0d9ee5d
DeformableDETR two stage support bfloat16 ( #30907 )
...
Update modeling_deformable_detr.py
2024-05-20 09:51:04 +01:00
Raushan Turganbay
5d0bf59b4d
LLaVa-Next: Update docs with batched inference ( #30857 )
...
* update docs with batch ex
* Update docs/source/en/model_doc/llava_next.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* accept nested list of img
---------
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2024-05-20 13:45:56 +05:00
Benjamin Warner
cd6bd0af34
Add support for torch.compile dynamic shapes ( #30560 )
...
* add torch.compile dynamic support
* Add SDPA dynamic shapes compile test & improve SDPA comment
* comment consistency
2024-05-20 10:36:57 +02:00
Younes Belkada
fce78fd0e9
FIX / Quantization: Fix Dockerfile build ( #30890 )
...
* Update Dockerfile
* Update docker/transformers-quantization-latest-gpu/Dockerfile
2024-05-20 10:08:26 +02:00
Joseph Enguehard
07bf2dff78
Add TokenClassification for Mistral, Mixtral and Qwen2 ( #29878 )
...
* Add MistralForTokenClassification
* Add tests and docs
* Add token classification for Mixtral and Qwen2
* Save llama for token classification draft
* Add token classification support for Llama, Gemma, Persimmon, StableLm and StarCoder2
* Formatting
* Add token classification support for Qwen2Moe model
* Add dropout layer to each ForTokenClassification model
* Add copied from in tests
* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Propagate suggested changes
* Style
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-05-20 10:06:57 +02:00
Abhiroop Tejomay
481a957814
Enable dynamic resolution input for Swin Transformer and variants ( #30656 )
...
* add interpolation of positional encoding support to swin
* add style changes
* use default image processor and make size a dictionary
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* remove logits testing
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Refactor image size validation logic when interpolation is disabled
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* remove asserts in modeling
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add dynamic resolution input support to swinv2
* change size to ensure interpolation encoding path is triggered
* set interpolate_pos_encoding default value to False
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* set interpolate_pos_encoding default value to False
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* set interpolate_pos_encoding default value to False
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* set interpolate_pos_encoding default value to False
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* set interpolate_pos_encoding default value to False
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* set interpolate_pos_encoding default value to False
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* set interpolate_pos_encoding default value to False
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* set interpolate_pos_encoding default value to False
* add dynamic resolution input to donut swin
* add dynamic resolution input to maskformer swin
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-17 18:38:46 +01:00
Arthur Zucker
b6eb708bf1
v4.42.dev.0
2024-05-17 17:30:41 +02:00
Pavel Iakubovskii
bf646fbf2d
Add fixed resize and pad strategy for object detection ( #30742 )
...
* Add resize and pad strategy
* Merge get_size functions
* Add pad_size + tests to object detection models
* Fixup
* Update docstrings
* Fixup
2024-05-17 16:21:26 +01:00
Arthur
e9a8041d1c
update release script ( #30880 )
...
* update release script
* update release script
2024-05-17 17:09:30 +02:00
Arthur
0a9300f474
Support arbitrary processor ( #30875 )
...
* Support arbitrary processor
* fix
* nit
* update
* nit
* nit
* fix and revert
* add a small test
* better check
* fixup
* bug so let's just use class for now
* oups
* .
2024-05-17 16:51:31 +02:00
Sanchit Gandhi
57edd84bdb
[whisper] fix multilingual fine-tuning ( #30865 )
...
* [whisper] fix multilingual fine-tuning
* config ids as well
2024-05-17 15:12:44 +01:00
Jacky Lee
977ce58a78
Fix dependencies for image classification example ( #30842 )
...
* fix: missing dependencies
* fix: image classification dependencies
2024-05-17 13:57:47 +01:00
Darshana S
3802e786ef
Enable device map ( #30870 )
...
* added_no_split_modules
* added LlavaNextVisionAttention to _no_split_modules
2024-05-17 12:50:24 +01:00
amyeroberts
57c965a8f1
Remove deprecated logic and warnings ( #30743 )
...
* Remove deprecated logic and warnings
* Add back some code that seems to be important...
* Let's just add all the nllb stuff back; removing it is a bit more involved
* Remove kwargs
* Remove more kwargs
2024-05-17 12:15:59 +01:00
Younes Belkada
3d7d3a87a0
TEST: Add llama logits tests ( #30835 )
...
* add llama logits test
* fix
* fix tests
"
"
* fix for a10
* format
* format
* fix
* [run-slow] remove fmt: skip
* Your commit message
* test commit
* Revert "test commit"
This reverts commit b66e01e55f5e31d4c0479cac4bcacc0f123dc9d2.
* [run-slow]llama
* Update tests/models/llama/test_modeling_llama.py
* [run-slow]llama
* empty commit
2024-05-17 12:23:00 +02:00