HuggingFace_transformer

Author	SHA1	Message	Date
renet10	b85ed49e0a	Corrections to PR #38642 and enhancements to Wav2Vec2Processor __call__ and pad docstrings (#38822 ) * Correcting PR #38642. The PR removed references to the deprecated method "as_target_processor()" in the __call__ and pad method docstrings, which is correct, but also removed all references to PreTrainedTokenizer, which is incorrect. This commit adds back the reference to PreTrainedTokenizer and also takes the opportunity to enhance the docstrings with the invocation procedure post removal of "as_target_processor()" and adds information on return values. * Update src/transformers/models/wav2vec2/processing_wav2vec2.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/wav2vec2/processing_wav2vec2.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/wav2vec2/processing_wav2vec2.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/wav2vec2/processing_wav2vec2.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/wav2vec2/processing_wav2vec2.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/wav2vec2/processing_wav2vec2.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/wav2vec2/processing_wav2vec2.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/wav2vec2/processing_wav2vec2.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/wav2vec2/processing_wav2vec2.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/wav2vec2/processing_wav2vec2.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/wav2vec2/processing_wav2vec2.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/wav2vec2/processing_wav2vec2.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/wav2vec2/processing_wav2vec2.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/wav2vec2/processing_wav2vec2.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/wav2vec2/processing_wav2vec2.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: René Tio <tor@Jammer.local> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-07-16 14:13:07 -07:00
Dhruv Malik	787a0128a9	create ijepa modelcard (ref : PR #36979 ). (#39354 ) * wip: adding first version of the IJEPA model card. * refactor based on the @stevhliu feedbacks * refactor: - revert the accidental removal of the autodoc api description and the image reerece architecture - general context updation. * - changes of model for example quantization. - merging the quantization content.	2025-07-16 12:40:22 -07:00
ridima11	48f2233cdf	Improve grammar and clarity in perf_hardware.md (#39428 )	2025-07-16 12:15:15 -07:00
Yaowei Zheng	e68ebb695f	fix cached file error when repo type is dataset (#36909 ) * fix cached file * Update hub.py	2025-07-16 18:02:26 +02:00
Krishnan Vignesh	35a416c400	Fix indentation bug in SmolVLM image processor causing KeyError (#39452 ) Fix indentation bug in Idefics3 image processor - Fix KeyError when do_image_splitting=False - Move split_images_grouped assignment inside loop - Ensures all image shapes are stored, not just the last one - This fixes the bug in both Idefics3 and generated SmolVLM processors cc @yonigozlan Co-authored-by: Krishnan Vignesh <krishnanvignesh@Krishnans-MacBook-Air.local>	2025-07-16 11:59:28 -04:00
Luke Friedrichs	2c58705dc2	Updated Megatron conversion script for gpt2 checkpoints (#38969 ) * update script to support new megatron gpt format * fixed quality failures --------- Co-authored-by: Luke Friedrichs <LckyLke>	2025-07-16 15:54:29 +00:00
Anton Vlasjuk	26be7f717e	[`CI`] Fix partially red CI (#39448 ) fix	2025-07-16 15:53:43 +02:00
sebastianvlad1	0a88751940	Fixes #39204 : add fallback if get_base_model missing (#39226 ) * Fixes #39204: add fallback if get_base_model missing * Inline try_get_base_model logic as suggested in PR review * Apply style fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-07-16 15:51:30 +02:00
Wing Lian	ba506f87db	make the loss context manager easier to extend (#39321 )	2025-07-16 15:47:24 +02:00
Arthur	9f1ac6f185	Remove something that should have never been there (#38254 ) * what the hell * update * style * style * typing * fix init issue * fix granite moe hybrid as well	2025-07-16 15:22:44 +02:00
Raushan Turganbay	a7ca5b5d67	Fix processor tests (#39450 ) fix	2025-07-16 15:01:35 +02:00
Kyle Sayers	71818f570b	[Bugfix] [Quantization] Remove unused init arg (#39324 ) remove unused arg from ct config init Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>	2025-07-16 14:57:42 +02:00
Pavel Iakubovskii	cc24b0378e	Better typing for model.config (#39132 ) * Apply to all models config annotation * Update modular to preserve order * Apply modular * fix define docstring * fix dinov2 consistency (docs<->modular) * fix InstructBlipVideoForConditionalGeneration docs<->modular consistency * fixup * remove duplicate code * Delete config_class attribute from the modeling code * Add config_class attribute in base model * Update init sub class * Deprecated models update * Update new models * Fix remote code BC issue * fixup * fixing more corner cases * fix new models * add test * modular docs update * fix comment a bit * fix for py3.9	2025-07-16 14:50:35 +02:00
Eon Kim	4b258454a7	Fix typo in generation configuration for Janus model weight conversion (#39432 ) * Fix typo in generation configuration for Janus model weight conversion * Fix typo * Update Janus model generation configuration * Update Janus model to use generation_kwargs	2025-07-16 14:28:02 +02:00
Lysandre Debut	de5ca373ac	Responses API in `transformers serve` (#39155 ) * Scaffolding * Explicit content * Naïve Responses API streaming implementation * Cleanup * Responses API (to be merged into #39155) (#39338) * Scaffolding * Explicit content * Naïve Responses API streaming implementation * Cleanup * use openai * validate request, including detecting unused fields * dict indexing * dict var access * tmp commit (tests failing) * add slow * use oai output type in completions * (little rebase errors) * working spec? * guard type hint * type hints. fix state (CB can now load different models) * type hints; fn names; error type * add docstrings * responses + kv cache * metadata support; fix kv cache; error event * add output_index and content_index * docstrings * add test_build_response_event * docs/comments * gate test requirements; terminate cb manager on model switch * nasty type hints * more type hints * disable validation by default; enable force models * todo --------- Co-authored-by: Lysandre <hi@lysand.re> * Slight bugfixes * PR comments from #39338 * make fixup --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Joao Gante <joao@huggingface.co>	2025-07-16 14:16:16 +02:00
Raushan Turganbay	c8524aeb07	[cache] make all classes cache compatible finally (#38635 ) * dump * push other models * fix simple greedy generation * xmod * add fmst and clean up some mentions of old cache format * gpt-bigcode now follows standards * delete tuple cache reference in generation * fix some models * fix some models * fix mambas and support cache in tapas * fix some more tests * fix copies * delete `_reorder_cache` * another fix copies * fix typos and delete unnecessary test * fix rag generate, needs special cache reordering * fix tapas and superglue * reformer create special cache * recurrent gemma `reorder_cache` was a no-op, delete * fix-copies * fix blio and musicgen pipeline tests * fix reformer * fix reformer, again... * delete `_supports_cache_class` * delete `supports_quantized_cache` * fix failing tests * fix copies * some minor clean up * style * style * fix copies * fix tests * fix copies * create causal mask now needs positions? * fixc copies * style * Update tests/test_modeling_common.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * clean-up of non-generative model after merging main * check `is_decoder` for cache * delete transpose for scores * remove tuple cache from docs everywhere * fix tests * fix copies * fix copies once more * properly deprecate `encoder_attention_mask` in Bert-like models * import `deprecate_kwarg` where needed * fix copies again * fix copies * delete `nex_decoder_cache` * fix copies asks to update for PLM * fix copies * rebasing had a few new models, fix them and merge asap! * fix copies once more * fix slow tests * fix tests and updare PLM checkpoint * add read token and revert accidentally removed line * oh com -on, style * just skip it, read token has no access to PLM yet --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-07-16 14:00:17 +02:00
Ilias Aarab	6cb43defd0	docs: add missing numpy import to minimal example (#39444 ) docs: add numpy import to minimal example	2025-07-16 11:57:13 +00:00
Yuanyuan Chen	61163099f1	Remove runtime conditions for type checking (#37340 ) Remove dynamic conditions for type checking Signed-off-by: cyy <cyyever@outlook.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-07-16 13:36:48 +02:00
Marc Sun	bfc9ddf5c6	Add StableAdamW Optimizer (#39446 ) * Added StableAdamW as an optimizer option for Trainer. Also wrote tests to verify its behaviour. * Fixed issue with * Added docs for StableAdamW. Also fixed a typo in schedule free optimizers --------- Co-authored-by: Gautham Krithiwas <gauthamkrithiwas2003@gmail.com>	2025-07-16 13:35:53 +02:00
Pablo Montalvo	b9ee528246	add test scanner (#39419 ) * add test scanner * add doc + license * refactor for only 1 tree traversal * add back test of only one method * document single method scan * format * fixup generate tests * minor fix * fixup * fixup doc	2025-07-16 12:45:46 +02:00
Ákos Hadnagy	79941c61ce	Fix missing definition of diff_file_url in notification service (#39445 ) Fix missing definition of diff_file_url	2025-07-16 12:09:18 +02:00
richardodliu	e048d48bd0	Add cosine_with_min_lr_schedule_with_warmup_lr_rate scheduler in Trainer (#31870 ) * add cosine_with_min_lr_schedule_with_warmup_lr_rate scheduler in trainer * Update src/transformers/optimization.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update optimization.py fix the error of the unclosed "(" * Update optimization.py remove whitespace in line 402 in order to pass the quality test * Update src/transformers/optimization.py * Update src/transformers/optimization.py * Apply style fixes --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-07-16 12:01:08 +02:00
Quentin Gallouédec	0cf08e90dd	Change log level from warning to info for scheduled request logging in `ContinuousBatchProcessor` (#39372 ) Change log level from warning to info for scheduled request logging in ContinuousBatchProcessor	2025-07-16 11:54:20 +02:00
Yuanyuan Chen	ae4e306a40	Defaults to adamw_torch_fused for Pytorch>=2.8 (#37358 ) * Defaults to adamw_torch_fused for latest Pytorch Signed-off-by: cyy <cyyever@outlook.com> * Fix test Signed-off-by: cyy <cyyever@outlook.com> --------- Signed-off-by: cyy <cyyever@outlook.com>	2025-07-16 09:52:33 +00:00
Jeonghwan Kim	4524a68c66	Fix L270 - hasattr("moe_args") returning False error (#38715 ) * Fix L270 - hasattr("moe_args") returning False error * Update src/transformers/models/llama4/convert_llama4_weights_to_hf.py --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-07-16 09:45:58 +00:00
Raushan Turganbay	d33a1c389f	[chat template] add a testcase for kwargs (#39415 ) add a testcase	2025-07-16 11:31:35 +02:00
S1quence	99c9763398	Fixed a bug calculating cross entropy loss in `JetMoeForCausalLM` (#37830 ) fix: 🐛 Fixed a bug in calculating Cross Entropy loss in JetMoeForCausalLM In the original code, we shift the logits and pass shift_logits into the self.loss_function, but in self.loss_function, the shift_logits will be shifted again, so we are actually doing "next next token prediction", which is incorrect. I have removed the logits shifting before calling self.loss_function. Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-07-16 11:22:00 +02:00
Klaus-Rudolf Kladny	667ad02374	Remove double soft-max in load-balancing loss. Fixes #39055 . (#39056 ) Remove double soft-max in load-balancing loss. Fixes #39055	2025-07-16 09:20:23 +00:00
Kyle Sayers	31d81943c9	[Core] [Offloading] Fix saving offloaded submodules (#39280 ) * fix counting meta tensors, fix onloading meta tensors Signed-off-by: Kyle Sayers <kylesayrs@gmail.com> * remove unrelated fix Signed-off-by: Kyle Sayers <kylesayrs@gmail.com> * remove unrelated change Signed-off-by: Kyle Sayers <kylesayrs@gmail.com> * add clarifying comment Signed-off-by: Kyle Sayers <kylesayrs@gmail.com> * add test_save_offloaded_model_with_direct_params Signed-off-by: Kyle Sayers <kylesayrs@gmail.com> * fix merge conflict, add decorators Signed-off-by: Kyle Sayers <kylesayrs@gmail.com> --------- Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>	2025-07-16 08:44:40 +00:00
Raushan Turganbay	add43c4d09	[autodocstring] add video and audio inputs (#39420 ) * add video and audio inputs in auto docstring * fix copies	2025-07-16 09:41:50 +02:00
Ákos Hadnagy	0dc2df5dda	CI workflow for performed test regressions (#39198 ) * WIP script to compare test runs for models * Update line normalitzation logic * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-07-16 04:20:02 +02:00
StevenBucaille	1bc9ac5107	docs: update LightGlue docs (#39407 ) * docs: update LightGlue docs * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-07-15 12:40:50 -07:00
StevenBucaille	d9574f2fe3	docs: update SuperGlue docs (#39406 ) * docs: update SuperGlue docs * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-07-15 12:40:26 -07:00
Raushan Turganbay	9f41f67135	[vlm] fix loading of retrieval VLMs (#39242 ) * fix vlm with retrieval * we can't use AutoModel because new ColQwen was released after refactor * no need for colqwen * tied weight keys are necessary, if using IMageTextToText * need to apply renaming in tied weights, only for ColPali * overwrite tied keys in ColPali * fix copies, modular can't handle if-statements	2025-07-15 17:23:54 +02:00
Wing Lian	b1d14086e4	handle training summary when creating modelcard but offline mode is set (#37095 ) * handle training summary when creating modelcard but offline mode is set * chore: lint	2025-07-15 17:21:15 +02:00
Dario Salvati	67f42928f0	Remove residual quantization attribute from dequantized models (#39373 ) * fix: removing quantization trace attribute from dequantized model Fixes #39295 * add: test `to(dtype=torch.float16)` after dequantization	2025-07-15 17:16:10 +02:00
Wangyi Jiang	30c508dbcb	Remove deprecated audio utils functions (#39330 ) Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-07-15 14:02:25 +00:00
Hosein Rezaei	d8e05951b8	Fix bugs in pytorch example run_clm when streaming is enabled (#39286 )	2025-07-15 15:37:28 +02:00
Matt	a989bf8d84	Fix bugs from pipeline preprocessor overhaul (#39425 ) * Correct load classes for VideoClassificationPipeline * Correct load classes for the ASR pipeline	2025-07-15 14:28:59 +01:00
Luc Georges	53c9dcd6fd	refactor: remove `set_tracer_provider` and `set_meter_provider` calls (#39422 )	2025-07-15 14:22:12 +02:00
Yuanyuan Chen	f03b384149	Fix invalid property (#39384 ) Signed-off-by: cyy <cyyever@outlook.com>	2025-07-15 12:11:37 +00:00
jiqing-feng	c4d41567fa	set document_question_answering pipeline _load_tokenizer to True (#39411 ) Signed-off-by: jiqing-feng <jiqing.feng@intel.com>	2025-07-15 12:05:49 +00:00
Matt	f56b49f48f	Ignore extra position embeddings weights for ESM (#39063 ) * Ignore extra position embeddings weights * Slight name fix	2025-07-15 11:57:32 +00:00
44670	2b79f14375	support loading qwen3 gguf (#38645 ) * support loading qwen3 gguf * Add qwen3 into GGUF_TO_FAST_CONVERTERS for tokenizer conversion * Add testcase * Fix formatting	2025-07-15 09:53:41 +00:00
Orion Weller	0e4b7938d0	Add ModernBERT Decoder Models - ModernBERT, but trained with CLM! (#38967 ) * working locally; need to style and test * added docs and initial tests; need to debug and flesh out * fixed tests * working long context; batches * working fa2 and eager * update tests * add missing confnigs * remove default autoset * fix spacing * fix most tests * fixed tests * fix to init * refactor to match new transformers updates * remove static cache option * fa2 fix * fix docs * in progress * working on tests * fixed issue with attn outputs * remove debug * fix local config attr * update doc string * fix docstring * add docs to toc * correct typo in toc * add new updates from main w.r.t. ModernBERT RoPE * fix local param --------- Co-authored-by: oweller2 <oweller2@dsailogin.mgmt.ai.cluster> Co-authored-by: oweller2 <oweller2@l07.mgmt.ai.cluster> Co-authored-by: oweller2 <oweller2@n02.mgmt.ai.cluster> Co-authored-by: oweller2 <oweller2@l08.mgmt.ai.cluster> Co-authored-by: oweller2 <oweller2@l01.mgmt.ai.cluster> Co-authored-by: oweller2 <oweller2@l02.mgmt.ai.cluster> v4.53.2-modernbert-decoder-preview	2025-07-15 10:40:41 +02:00
Alvaro Bartolome	0b724114cf	Fix typo in `/v1/models` output payload (#39414 )	2025-07-15 08:59:25 +01:00
Raushan Turganbay	8d6259b0b8	[refactor] set attention implementation (#38974 ) * update * fix some tests * init from config, changes it in-place, add deepcopy in tests * fix modernbert * don't delete thsi config attr * update * style and copies * skip tests in generation * fix style * accidentally removed flash-attn-3, revert * docs * forgot about flags set to False * fix copies * address a few comments * fix copies * custom code BC	2025-07-15 09:34:06 +02:00
Sameeraja Shyam	6017f5e8ed	[siglip] fix pooling comment (#39378 ) * feat(siglip2): add forward pass with pooled output logic in Siglip2TextModel * test(siglip2): add test_text_model.py to verify pooled output behavior * style(siglip2): fix formatting in test_text_model.py using Ruff * fix(siglip2): remove misleading 'sticky EOS' comment and sync modular-classic files * fix(siglip2): remove misleading 'sticky EOS' comment and sync modular-classic files * chore(siglip2): regenerate classic model after modular change * Update	2025-07-14 17:47:19 +00:00
Tanuj Rai	8d40ca5749	Update phi4_multimodal.md (#38830 ) * Update phi4_multimodal.md * Update docs/source/en/model_doc/phi4_multimodal.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/phi4_multimodal.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/phi4_multimodal.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/phi4_multimodal.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/phi4_multimodal.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update phi4_multimodal.md * Update phi4_multimodal.md * Update phi4_multimodal.md * Update phi4_multimodal.md * Update phi4_multimodal.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-07-14 10:35:17 -07:00
MilkClouds	3635415af2	[Docs] Fix typo in CustomTrainer compute_loss method and adjust loss reduction logic (#39391 ) Fix typo in CustomTrainer compute_loss method and adjust loss reduction logic	2025-07-14 09:25:06 -07:00

19670 Commits