Commit Graph

19670 Commits

Author SHA1 Message Date
renet10
b85ed49e0a Corrections to PR #38642 and enhancements to Wav2Vec2Processor __call__ and pad docstrings (#38822)
* Correcting PR #38642.  The PR removed references to the deprecated method "as_target_processor()" in the
__call__ and pad method docstrings, which is correct, but also removed all references to PreTrainedTokenizer,
which is incorrect.  This commit adds back the reference to PreTrainedTokenizer and also takes the
opportunity to enhance the docstrings with the invocation procedure post removal of "as_target_processor()"
and adds information on return values.

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: René Tio <tor@Jammer.local>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-16 14:13:07 -07:00
Dhruv Malik
787a0128a9 create ijepa modelcard (ref : PR #36979 ). (#39354)
* wip: adding first version of the IJEPA model card.

* refactor based on the @stevhliu feedbacks

* refactor:
- revert the accidental removal of the autodoc api description and the image reerece architecture

- general context updation.

* - changes of model for example quantization.
- merging the  quantization content.
2025-07-16 12:40:22 -07:00
ridima11
48f2233cdf Improve grammar and clarity in perf_hardware.md (#39428) 2025-07-16 12:15:15 -07:00
Yaowei Zheng
e68ebb695f fix cached file error when repo type is dataset (#36909)
* fix cached file

* Update hub.py
2025-07-16 18:02:26 +02:00
Krishnan Vignesh
35a416c400 Fix indentation bug in SmolVLM image processor causing KeyError (#39452)
Fix indentation bug in Idefics3 image processor

- Fix KeyError when do_image_splitting=False
- Move split_images_grouped assignment inside loop
- Ensures all image shapes are stored, not just the last one
- This fixes the bug in both Idefics3 and generated SmolVLM processors

cc @yonigozlan

Co-authored-by: Krishnan Vignesh <krishnanvignesh@Krishnans-MacBook-Air.local>
2025-07-16 11:59:28 -04:00
Luke Friedrichs
2c58705dc2 Updated Megatron conversion script for gpt2 checkpoints (#38969)
* update script to support new megatron gpt format

* fixed quality failures

---------

Co-authored-by: Luke Friedrichs <LckyLke>
2025-07-16 15:54:29 +00:00
Anton Vlasjuk
26be7f717e [CI] Fix partially red CI (#39448)
fix
2025-07-16 15:53:43 +02:00
sebastianvlad1
0a88751940 Fixes #39204: add fallback if get_base_model missing (#39226)
* Fixes #39204: add fallback if get_base_model missing

* Inline try_get_base_model logic as suggested in PR review

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-07-16 15:51:30 +02:00
Wing Lian
ba506f87db make the loss context manager easier to extend (#39321) 2025-07-16 15:47:24 +02:00
Arthur
9f1ac6f185 Remove something that should have never been there (#38254)
* what the hell

* update

* style

* style

* typing

* fix init issue

* fix granite moe hybrid as well
2025-07-16 15:22:44 +02:00
Raushan Turganbay
a7ca5b5d67 Fix processor tests (#39450)
fix
2025-07-16 15:01:35 +02:00
Kyle Sayers
71818f570b [Bugfix] [Quantization] Remove unused init arg (#39324)
remove unused arg from ct config init

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
2025-07-16 14:57:42 +02:00
Pavel Iakubovskii
cc24b0378e Better typing for model.config (#39132)
* Apply to all models config annotation

* Update modular to preserve order

* Apply modular

* fix define docstring

* fix dinov2 consistency (docs<->modular)

* fix InstructBlipVideoForConditionalGeneration docs<->modular consistency

* fixup

* remove duplicate code

* Delete config_class attribute from the modeling code

* Add config_class attribute in base model

* Update init sub class

* Deprecated models update

* Update new models

* Fix remote code BC issue

* fixup

* fixing more corner cases

* fix new models

* add test

* modular docs update

* fix comment a bit

* fix for py3.9
2025-07-16 14:50:35 +02:00
Eon Kim
4b258454a7 Fix typo in generation configuration for Janus model weight conversion (#39432)
* Fix typo in generation configuration for Janus model weight conversion

* Fix typo

* Update Janus model generation configuration

* Update Janus model to use generation_kwargs
2025-07-16 14:28:02 +02:00
Lysandre Debut
de5ca373ac Responses API in transformers serve (#39155)
* Scaffolding

* Explicit content

* Naïve Responses API streaming implementation

* Cleanup

* Responses API (to be merged into #39155) (#39338)

* Scaffolding

* Explicit content

* Naïve Responses API streaming implementation

* Cleanup

* use openai

* validate request, including detecting unused fields

* dict indexing

* dict var access

* tmp commit (tests failing)

* add slow

* use oai output type in completions

* (little rebase errors)

* working spec?

* guard type hint

* type hints. fix state (CB can now load different models)

* type hints; fn names; error type

* add docstrings

* responses + kv cache

* metadata support; fix kv cache; error event

* add output_index and content_index

* docstrings

* add test_build_response_event

* docs/comments

* gate test requirements; terminate cb manager on model switch

* nasty type hints

* more type hints

* disable validation by default; enable force models

* todo

---------

Co-authored-by: Lysandre <hi@lysand.re>

* Slight bugfixes

* PR comments from #39338

* make fixup

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
2025-07-16 14:16:16 +02:00
Raushan Turganbay
c8524aeb07 [cache] make all classes cache compatible finally (#38635)
* dump

* push other models

* fix simple greedy generation

* xmod

* add fmst and clean up some mentions of old cache format

* gpt-bigcode now follows standards

* delete tuple cache reference in generation

* fix some models

* fix some models

* fix mambas and support cache in tapas

* fix some more tests

* fix copies

* delete `_reorder_cache`

* another fix copies

* fix typos and delete unnecessary test

* fix rag generate, needs special cache reordering

* fix tapas and superglue

* reformer create special cache

* recurrent gemma `reorder_cache` was a no-op, delete

* fix-copies

* fix blio and musicgen pipeline tests

* fix reformer

* fix reformer, again...

* delete `_supports_cache_class`

* delete `supports_quantized_cache`

* fix failing tests

* fix copies

* some minor clean up

* style

* style

* fix copies

* fix tests

* fix copies

* create causal mask now needs positions?

* fixc copies

* style

* Update tests/test_modeling_common.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* clean-up of non-generative model after merging main

* check `is_decoder` for cache

* delete transpose for scores

* remove tuple cache from docs everywhere

* fix tests

* fix copies

* fix copies once more

* properly deprecate `encoder_attention_mask` in Bert-like models

* import `deprecate_kwarg` where needed

* fix copies again

* fix copies

* delete `nex_decoder_cache`

* fix copies asks to update for PLM

* fix copies

* rebasing had a few new models, fix them and merge asap!

* fix copies once more

* fix slow tests

* fix tests and updare PLM checkpoint

* add read token and revert accidentally removed line

* oh com -on, style

* just skip it, read token has no access to PLM yet

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-07-16 14:00:17 +02:00
Ilias Aarab
6cb43defd0 docs: add missing numpy import to minimal example (#39444)
docs: add numpy import to minimal example
2025-07-16 11:57:13 +00:00
Yuanyuan Chen
61163099f1 Remove runtime conditions for type checking (#37340)
Remove dynamic conditions for type checking

Signed-off-by: cyy <cyyever@outlook.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-07-16 13:36:48 +02:00
Marc Sun
bfc9ddf5c6 Add StableAdamW Optimizer (#39446)
* Added StableAdamW as an optimizer option for Trainer. Also wrote tests to verify its behaviour.

* Fixed issue with

* Added docs for StableAdamW. Also fixed a typo in schedule free optimizers

---------

Co-authored-by: Gautham Krithiwas <gauthamkrithiwas2003@gmail.com>
2025-07-16 13:35:53 +02:00
Pablo Montalvo
b9ee528246 add test scanner (#39419)
* add test scanner

* add doc + license

* refactor for only 1 tree traversal

* add back test of only one method

* document single method scan

* format

* fixup generate tests

* minor fix

* fixup

* fixup doc
2025-07-16 12:45:46 +02:00
Ákos Hadnagy
79941c61ce Fix missing definition of diff_file_url in notification service (#39445)
Fix missing definition of diff_file_url
2025-07-16 12:09:18 +02:00
richardodliu
e048d48bd0 Add cosine_with_min_lr_schedule_with_warmup_lr_rate scheduler in Trainer (#31870)
* add cosine_with_min_lr_schedule_with_warmup_lr_rate scheduler in trainer

* Update src/transformers/optimization.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update optimization.py

fix the error of the unclosed "("

* Update optimization.py

remove whitespace in line 402 in order to pass the quality test

* Update src/transformers/optimization.py

* Update src/transformers/optimization.py

* Apply style fixes

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-07-16 12:01:08 +02:00
Quentin Gallouédec
0cf08e90dd Change log level from warning to info for scheduled request logging in ContinuousBatchProcessor (#39372)
Change log level from warning to info for scheduled request logging in ContinuousBatchProcessor
2025-07-16 11:54:20 +02:00
Yuanyuan Chen
ae4e306a40 Defaults to adamw_torch_fused for Pytorch>=2.8 (#37358)
* Defaults to adamw_torch_fused for latest Pytorch

Signed-off-by: cyy <cyyever@outlook.com>

* Fix test

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-07-16 09:52:33 +00:00
Jeonghwan Kim
4524a68c66 Fix L270 - hasattr("moe_args") returning False error (#38715)
* Fix L270 - hasattr("moe_args") returning False error

* Update src/transformers/models/llama4/convert_llama4_weights_to_hf.py

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-07-16 09:45:58 +00:00
Raushan Turganbay
d33a1c389f [chat template] add a testcase for kwargs (#39415)
add a testcase
2025-07-16 11:31:35 +02:00
S1quence
99c9763398 Fixed a bug calculating cross entropy loss in JetMoeForCausalLM (#37830)
fix: 🐛 Fixed a bug in calculating Cross Entropy loss in JetMoeForCausalLM

In the original code, we shift the logits and pass shift_logits into the self.loss_function, but in self.loss_function, the shift_logits will be shifted again, so we are actually doing "next next token prediction", which is incorrect. I have removed the logits shifting before calling self.loss_function.

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-07-16 11:22:00 +02:00
Klaus-Rudolf Kladny
667ad02374 Remove double soft-max in load-balancing loss. Fixes #39055 . (#39056)
Remove double soft-max in load-balancing loss. Fixes #39055
2025-07-16 09:20:23 +00:00
Kyle Sayers
31d81943c9 [Core] [Offloading] Fix saving offloaded submodules (#39280)
* fix counting meta tensors, fix onloading meta tensors

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* remove unrelated fix

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* remove unrelated change

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add clarifying comment

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add test_save_offloaded_model_with_direct_params

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* fix merge conflict, add decorators

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

---------

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
2025-07-16 08:44:40 +00:00
Raushan Turganbay
add43c4d09 [autodocstring] add video and audio inputs (#39420)
* add  video and audio inputs in auto docstring

* fix copies
2025-07-16 09:41:50 +02:00
Ákos Hadnagy
0dc2df5dda CI workflow for performed test regressions (#39198)
* WIP script to compare test runs for models

* Update line normalitzation logic

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-07-16 04:20:02 +02:00
StevenBucaille
1bc9ac5107 docs: update LightGlue docs (#39407)
* docs: update LightGlue docs

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-15 12:40:50 -07:00
StevenBucaille
d9574f2fe3 docs: update SuperGlue docs (#39406)
* docs: update SuperGlue docs

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-15 12:40:26 -07:00
Raushan Turganbay
9f41f67135 [vlm] fix loading of retrieval VLMs (#39242)
* fix vlm with retrieval

* we can't use AutoModel because new ColQwen was released after refactor

* no need for colqwen

* tied weight keys are necessary, if using IMageTextToText

* need to apply renaming in tied weights, only for ColPali

* overwrite tied keys in ColPali

* fix copies, modular can't handle if-statements
2025-07-15 17:23:54 +02:00
Wing Lian
b1d14086e4 handle training summary when creating modelcard but offline mode is set (#37095)
* handle training summary when creating modelcard but offline mode is set

* chore: lint
2025-07-15 17:21:15 +02:00
Dario Salvati
67f42928f0 Remove residual quantization attribute from dequantized models (#39373)
* fix: removing quantization trace attribute from dequantized model

Fixes #39295

* add: test `to(dtype=torch.float16)` after dequantization
2025-07-15 17:16:10 +02:00
Wangyi Jiang
30c508dbcb Remove deprecated audio utils functions (#39330)
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-07-15 14:02:25 +00:00
Hosein Rezaei
d8e05951b8 Fix bugs in pytorch example run_clm when streaming is enabled (#39286) 2025-07-15 15:37:28 +02:00
Matt
a989bf8d84 Fix bugs from pipeline preprocessor overhaul (#39425)
* Correct load classes for VideoClassificationPipeline

* Correct load classes for the ASR pipeline
2025-07-15 14:28:59 +01:00
Luc Georges
53c9dcd6fd refactor: remove set_tracer_provider and set_meter_provider calls (#39422) 2025-07-15 14:22:12 +02:00
Yuanyuan Chen
f03b384149 Fix invalid property (#39384)
Signed-off-by: cyy <cyyever@outlook.com>
2025-07-15 12:11:37 +00:00
jiqing-feng
c4d41567fa set document_question_answering pipeline _load_tokenizer to True (#39411)
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-07-15 12:05:49 +00:00
Matt
f56b49f48f Ignore extra position embeddings weights for ESM (#39063)
* Ignore extra position embeddings weights

* Slight name fix
2025-07-15 11:57:32 +00:00
44670
2b79f14375 support loading qwen3 gguf (#38645)
* support loading qwen3 gguf

* Add qwen3 into GGUF_TO_FAST_CONVERTERS for tokenizer conversion

* Add testcase

* Fix formatting
2025-07-15 09:53:41 +00:00
Orion Weller
0e4b7938d0 Add ModernBERT Decoder Models - ModernBERT, but trained with CLM! (#38967)
Some checks failed
Release - Conda / build_and_package (push) Has been cancelled
Secret Leaks / trufflehog (push) Has been cancelled
* working locally; need to style and test

* added docs and initial tests; need to debug and flesh out

* fixed tests

* working long context; batches

* working fa2 and eager

* update tests

* add missing confnigs

* remove default autoset

* fix spacing

* fix most tests

* fixed tests

* fix to init

* refactor to match new transformers updates

* remove static cache option

* fa2 fix

* fix docs

* in progress

* working on tests

* fixed issue with attn outputs

* remove debug

* fix local config attr

* update doc string

* fix docstring

* add docs to toc

* correct typo in toc

* add new updates from main w.r.t. ModernBERT RoPE

* fix local param

---------

Co-authored-by: oweller2 <oweller2@dsailogin.mgmt.ai.cluster>
Co-authored-by: oweller2 <oweller2@l07.mgmt.ai.cluster>
Co-authored-by: oweller2 <oweller2@n02.mgmt.ai.cluster>
Co-authored-by: oweller2 <oweller2@l08.mgmt.ai.cluster>
Co-authored-by: oweller2 <oweller2@l01.mgmt.ai.cluster>
Co-authored-by: oweller2 <oweller2@l02.mgmt.ai.cluster>
v4.53.2-modernbert-decoder-preview
2025-07-15 10:40:41 +02:00
Alvaro Bartolome
0b724114cf Fix typo in /v1/models output payload (#39414) 2025-07-15 08:59:25 +01:00
Raushan Turganbay
8d6259b0b8 [refactor] set attention implementation (#38974)
* update

* fix some tests

* init from config, changes it in-place, add deepcopy in tests

* fix modernbert

* don't delete thsi config attr

* update

* style and copies

* skip tests in generation

* fix style

* accidentally removed flash-attn-3, revert

* docs

* forgot about flags set to False

* fix copies

* address a few comments

* fix copies

* custom code BC
2025-07-15 09:34:06 +02:00
Sameeraja Shyam
6017f5e8ed [siglip] fix pooling comment (#39378)
* feat(siglip2): add forward pass with pooled output logic in Siglip2TextModel

* test(siglip2): add test_text_model.py to verify pooled output behavior

* style(siglip2): fix formatting in test_text_model.py using Ruff

* fix(siglip2): remove misleading 'sticky EOS' comment and sync modular-classic files

* fix(siglip2): remove misleading 'sticky EOS' comment and sync modular-classic files

* chore(siglip2): regenerate classic model after modular change

* Update
2025-07-14 17:47:19 +00:00
Tanuj Rai
8d40ca5749 Update phi4_multimodal.md (#38830)
* Update phi4_multimodal.md

* Update docs/source/en/model_doc/phi4_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/phi4_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/phi4_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/phi4_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/phi4_multimodal.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update phi4_multimodal.md

* Update phi4_multimodal.md

* Update phi4_multimodal.md

* Update phi4_multimodal.md

* Update phi4_multimodal.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-14 10:35:17 -07:00
MilkClouds
3635415af2 [Docs] Fix typo in CustomTrainer compute_loss method and adjust loss reduction logic (#39391)
Fix typo in CustomTrainer compute_loss method and adjust loss reduction logic
2025-07-14 09:25:06 -07:00