HuggingFace_transformer

Author	SHA1	Message	Date
Arthur	170b2708cb	Fixes #40262 (tag: v4.55.3)	2025-08-21 11:03:16 +02:00
Arthur	7dbc054e2a	v4.55.3	2025-08-18 14:46:54 +02:00
Zhen	c097a43898	[bugfix] fix flash-attention2 unavailable error for Ascend NPU (#40151) * [bugfix] fix flash-attention2 unavailable error for Ascend NPU * remove redundant apply_rotary_emb usage * fix ruff check error * pad_input and unpad_input use same implementation as fa2 * rollback redundant codes * fix ruff check error * optimize fa2 judgement logic	2025-08-18 14:45:23 +02:00
Cyril Vallez	663cbb0d04	[FA2] Fix it finally - revert fa kwargs preparation (#40161) revert	2025-08-18 14:44:58 +02:00
Cyril Vallez	c7bd5350f0	Fix fsdp for generic-task models #40191	2025-08-18 14:44:16 +02:00
Lintch	e75d67ec39	Fix GPT-OSS swiglu_limit not passed in for MXFP4 #40197	2025-08-18 14:43:31 +02:00
Manuel de Prada Corral	d7f67d2006	Fix mamba caches (#40203) fix mamba models caches inheritance	2025-08-18 14:27:04 +02:00
Arthur	acf295aec3	v4.55.2 (tag: v4.55.2)	2025-08-13 20:14:33 +02:00
Arthur Zucker	aaa3169aa2	qfix bad cherry-pick	2025-08-13 18:13:21 +00:00
Arthur	ea2eee0bc8	v4.55.1 (tag: v4.55.1)	2025-08-13 10:33:42 +02:00
Quentin Gallouédec	956be23fff	[bugfix] Fix tensor device in Idefics2, Idefics3, and SmolVLM (#39975) * [bugfix] ensure correct tensor device in Idefics2, Idefics3, and SmolVLM models * to cuda	2025-08-13 10:33:17 +02:00
Anton Vlasjuk	79a9ffc520	fix merge conlicts	2025-08-13 10:25:20 +02:00
Mohamed Mekkouri	99404c7098	Default to dequantize if cpu in device_map for mxfp4 (#39993) * default to dq if cpu * an other check * style * revert some changes	2025-08-13 10:22:01 +02:00
Anton Vlasjuk	0d6908038c	[`GPT Big Code`] Fix attention scaling (#40041) * fix * update integration tests * fmt * add regression test	2025-08-13 10:22:01 +02:00
Tsumugii	b8e97fbfd2	fix: resolve triton version check compatibility on windows (#39986) * fix: resolve triton version check compatibility on windows * style: remove trailing space * fix: fix typo --------- Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-08-13 10:22:01 +02:00
Laurenz Ruzicka	586b6e693b	Fix missing None default values for Gemma3n model in get_placeholder_mask (#39991) (#40024) * Fix missing None default values for Gemma3n model in get_placeholder_mask (#39991) * Switched definition of optional from | None to Optiona[] (Issue #39991) --------- Co-authored-by: Laurenz Ruzicka <Laurenz.Ruzicka@ait.ac.at>	2025-08-13 10:22:01 +02:00
Isotr0py	95ae07d11f	Fix broken image inference for Fuyu model (#39915) * fix fuyu Signed-off-by: Isotr0py <2037008807@qq.com> * oops Signed-off-by: Isotr0py <2037008807@qq.com> * run test on GPU Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> * clean unused Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> * revert Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> * add fuyu multimodal test Signed-off-by: Isotr0py <2037008807@qq.com> * fix Signed-off-by: Isotr0py <2037008807@qq.com> --------- Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-08-13 10:22:01 +02:00
Shuming Hu	0d9032ae71	Fix missing video inputs for PerceptionLM. (#39971) * Fix missing video inputs for PerceptionLM. * Minor fix for vanilla input image (only C,H,W, no tiles dim). * Revert "Minor fix for vanilla input image (only C,H,W, no tiles dim)." This reverts commit 181d87b964e59c4118035a9fd4f530c6e551ba9f.	2025-08-13 10:22:01 +02:00
Raushan Turganbay	1d42803aac	[Idefics] fix device mismatch (#39981) fix	2025-08-13 10:22:01 +02:00
Marc Sun	382717e543	remove `triton_kernels` dep with `kernels` instead (#39926) * remove dep * style * rm import * fix * style * simplify * style	2025-08-13 10:22:01 +02:00
Matthew Douglas	cc98f42d22	Enable gpt-oss mxfp4 on older hardware (sm75+) (#39940) Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-08-13 10:22:01 +02:00
Lintch	d2f7266367	Fix MXFP4 quantizer validation to allow CPU inference with dequantize option (#39953) * Fix MXFP4 quantizer validation to enable CPU dequantization Move dequantize check before CUDA availability check to allow CPU inference when quantization_config.dequantize is True. This enables users to run MXFP4 models on CPU by automatically converting them to BF16 format. * Add tests for MXFP4 quantizer CPU dequantization validation * fix: format mxfp4 test file with ruff	2025-08-13 10:22:01 +02:00
Joao Gante	daab2db33f	[CI] post-`GptOss` fixes for green CI (#39929)	2025-08-07 16:27:00 +02:00
Lysandre	06f8004e5c	Release: v4.55.0 (tag: v4.55.0)	2025-08-05 18:09:15 +02:00
Lysandre Debut	c54203a32e	gpt_oss last chat template changes (#39925) Last chat template changes	2025-08-05 18:08:08 +02:00
Arthur	7c38d8fc23	Add GPT OSS model from OpenAI (#39923 ) * fix * nice * where i am at * Bro this works * Update src/transformers/integrations/tensor_parallel.py * cleanups * yups that was breaking * Update src/transformers/models/openai_moe/modeling_openai_moe.py * gather on experts and not mlp * add changes for latest convert branch * adds options to get output_router_logits from config * bring chat temlate + special tokens back into the script. * initial commmit * update * working with shards * add model.safetensors.index.json * fix * fix * mxfp4 flag * rm print * Fix PAD/EOS/BOS (#18) * fix pad/eos/bos * base model maybe one day * add some doc * special tokens based on harmony. * add in tokenizer config as well. * prepare for rebase with main * Fix for initialize_tensor_parallelism now returning 4-tuple ``` [rank0]: File "/fsx/edward/work/openai-tsm-examples/examples/generate.py", line 17, in <module> [rank0]: model = AutoModelForCausalLM.from_pretrained( [rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ [rank0]: File "/fsx/edward/work/new-model-addition-openai/src/transformers/models/auto/auto_factory.py", line 600, in from_pretrained [rank0]: return model_class.from_pretrained( [rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ [rank0]: File "/fsx/edward/work/new-model-addition-openai/src/transformers/modeling_utils.py", line 316, in _wrapper [rank0]: return func(args, kwargs) [rank0]: ^^^^^^^^^^^^^^^^^^^^^ [rank0]: File "/fsx/edward/work/new-model-addition-openai/src/transformers/modeling_utils.py", line 4748, in from_pretrained [rank0]: tp_plan, device_map, device_mesh = initialize_tensor_parallelism(tp_plan, tp_size=None) [rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ [rank0]: ValueError: too many values to unpack (expected 3) ``` mxfp4 * mxfp4 draft * fix * fix import * draft * draft impl * finally working ! 
* simplify * add import * working version * consider blocks and scales * device mesh fix * initial commit * add working dequant + quant logic * update * non nan, gibberish output * working EP + quantization finally ! * start cleaning * remove reversing process * style * some cleaning * initial commmit * more cleaning * more cleaning * simplify * more cleaning * rm duplicated function * changing tp_plan * update tp plan check * add loading attribute * dequantizing logic * use subfunctions * import cleaning * update_param_name * adds clamped swiglu * add clamping to training path * simplify dequant logic * update * Bad merge * more simplifications & tests * fix ! * fix registering custom attention * fix order * fixes * some test nits * nits * nit * fix * Clamp sink logits * Clean * Soft-max trick * Clean up * p * fix deepspeed * update both modeling and modular for cleanup * contiguous * update tests * fix top_k router call * revert renaming * test nits * small fixes for EP * fix path for our local tests * update as I should not have broken that! * fix the loss of mixtral * revert part of the changes related to router_scores, kernel probably no ready for that! * deleting a small nit * update arch * fix post processing * update * running version but not expected output * moving to cuda * initial commit * revert * erroring when loading on cpu * updates * del blocks, scales * fix * style * rm comm * comment * add comment * style * remove duplicated lines * Fix minor issue with weight_map conversion script * fix sampling params * rename to final name * upate pre-final version of template * Update src/transformers/models/gpt_oss/convert_gpt_oss_weights_to_hf.py * fix batched inference * serve fixes * swizzle ! * update final chat template by Matt. * fix responses; pin oai * sinplify * Thanks Matt for his tireless efforts! 
Co-authored-by: Rocketknight1 <Rocketknight1@users.noreply.github.com> * Update src/transformers/models/gpt_oss/convert_gpt_oss_weights_to_hf.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * fix * Use ROCm kernels from HUB * Make kernel modes explicit * update final chat template by Matt. x2 * Thanks Matt for his tireless efforts! Co-authored-by: Rocketknight1 <Rocketknight1@users.noreply.github.com> * Fix installation * Update setup.py Co-authored-by: Ákos Hadnagy <akos.hadnagy@gmail.com> * allow no content * fix: update message handling in write_tokenizer function * Fix template logic for user message role * last nits for CB and flash_paged! * there was one bad merge * fix CB (hardcode for now, its just using kv groups instead) * fix * better fix for device_map * minor device fix * Fix flash paged * updates * Revert "remove dtensors, not explicit (#39840)" This reverts commit `6dfd561d9c`. * update * Revert "remove dtensors, not explicit (#39840)" This reverts commit `6dfd561d9c`. * fix merge * fix * Fix line break when custom model indentity * nits testing * to locals first and pass sliding window to flash paged * register modes for MegaBlocksMoeMlp * add integration test in fixtures -> now update the tests to use it! * update integration tests * initial fix * style and update tests * fix * chore(gpt oss): remove mlp_bias from configuration It was just a leftover. 
* stats * Integration tests * whoops * Shouldn't move model * Ensure assistant messages without thinking always go to "final" channel * More checks to ensure expected format * Add pad_token_id to model configuration in write_model function (#51) * Add oai fix fast tests (#59) * Fix some fast tests * Force some updates * Remove unnecessary fixes * Update src/transformers/models/gpt_oss/convert_gpt_oss_weights_to_hf.py Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * Update src/transformers/models/gpt_oss/convert_gpt_oss_weights_to_hf.py Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * Update src/transformers/models/gpt_oss/convert_gpt_oss_weights_to_hf.py * reasoning -> Reasoning * Add additional integration tests * fixup * Slight fixes * align chat template with harmony * simplify * Add comment * torch testing assert close * torch testing assert close * torch testing assert close * torch testing assert close * torch testing assert close * torch testing assert close * Revert fixup * skip 2 test remove todo * merge * padding side should be left for integration tests * fix modular wrt to changes made to modeling * style * isort * fix opies for the loss * mmmm --------- Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com> Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Marc Sun <marc@huggingface.co> Co-authored-by: edbeeching <edbeeching@gmail.com> Co-authored-by: Vaibhavs10 <vaibhavs10@gmail.com> Co-authored-by: MekkCyber <mekk.cyber@gmail.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Edward Beeching <edbeeching@users.noreply.github.com> Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com> Co-authored-by: Lewis Tunstall <lewis.c.tunstall@gmail.com> Co-authored-by: Zhuohan Li <zhuohan@openai.com> Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: 
joao@huggingface.co <joao@ip-10-53-88-32.ec2.internal> Co-authored-by: Rocketknight1 <Rocketknight1@users.noreply.github.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Akos Hadnagy <akos@ahadnagy.com> Co-authored-by: Ákos Hadnagy <akos.hadnagy@gmail.com> Co-authored-by: Alvaro Moran <alvaro.moran@huggingface.co> Co-authored-by: Lysandre <hi@lysand.re> Co-authored-by: Matt <rocketknight1@gmail.com>	2025-08-05 18:02:18 +02:00
TaeHyeon Jeon	738c1a3899	🌐 [i18n-KO] Translated `cache_explanation.md` to Korean (#39535) * update: _toctree.yml * docs: ko: cache_explanation.md * feat: nmt draft * fix: apply yijun-lee's comments * fix: apply 4N3MONE's comments * docs: update cache_position * docs: update cache-storage-implementation * update: add h2 tag in cache-position --------- Co-authored-by: taehyeonjeon <xogus294@gmail.com>	2025-08-05 08:20:13 -07:00
Guang Yang	d2ae766836	Export SmolvLM (#39614) Export SmolVLM for ExecuTorch	2025-08-05 16:20:23 +02:00
ppaanngggg	c430047602	[docs] update object detection guide (#39909) * Update object_detection.md * Update object_detection.md	2025-08-05 14:07:21 +00:00
Arthur	dedcbd6e3d	run model debugging with forward arg (#39905) * run model debugging a lot simpler * fixup * Update src/transformers/utils/generic.py * fixup * mode syle? * guard a bit	2025-08-05 15:46:19 +02:00
Arthur	20ce210ab7	Revert "remove dtensors, not explicit (#39840)" (#39912) * Revert "remove dtensors, not explicit (#39840)" This did not work with generation (lm_head needs extra care!) This reverts commit `6dfd561d9c`. * update * style?	2025-08-05 15:12:14 +02:00
Raushan Turganbay	2589a52c5c	Fix aria tests (#39879) * fix aria tests * awful bug * fix copies * fix tests * fix style * revert this	2025-08-05 13:48:47 +02:00
Justin van Heek	6e4a9a5b43	Fix eval thread fork bomb (#39717)	2025-08-05 10:50:32 +00:00
Yuanyuan Chen	98a3c49135	Replace video_fps with fps in tests (#39898) Signed-off-by: cyy <cyyever@outlook.com>	2025-08-05 10:39:55 +00:00
nnul	1af1071081	Fix misleading WandB error when WANDB_DISABLED is set (#39891) When users set `report_to="wandb"` but also have `WANDB_DISABLED=true` in their environment, the previous error message was misleading: "WandbCallback requires wandb to be installed. Run pip install wandb." This was confusing because wandb was actually installed, just disabled via the environment variable. The fix detects this specific case and provides a clear, actionable error message explaining the conflict and how to resolve it.	2025-08-05 10:18:18 +00:00
Yidi Wu	78ef84921b	Avoid aliasing in cond's branches for torch 2.8 (#39488) Avoid alaising in cond's branches Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-08-05 11:18:11 +02:00
Yuanyuan Chen	9e676e6a0e	[qwen] remove unnecessary CUDA sync in qwen2_5_vl (#39870) Signed-off-by: cyy <cyyever@outlook.com> Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-08-05 08:54:16 +00:00
Yao Matrix	392be3b282	fix test_working_of_tp failure of accelerate ut (#39828) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-08-05 08:52:57 +00:00
Arthur	cc5de36454	[`Exaone4`] Fixes the attn implementation! (#39906) * fix * fix config	2025-08-05 09:29:16 +02:00
Lysandre Debut	00d47757bf	Reorder serving docs (#39634) * Slight reorg * LLMs + draft VLMs * Actual VLM examples * Initial responses * Reorder * Update docs/source/en/serving.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/tiny_agents.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/open_webui.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/cursor.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/serving.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Responses API * Address Pedro's comments --------- Co-authored-by: Pedro Cuenca <pedro@huggingface.co>	2025-08-05 08:43:06 +02:00
Arpon Kapuria	8c4ea670dc	chore: update DETR model card (#39822) * Update model card for DETR * fix: applied suggested changes * fix: simplified pipeline and modified notes and resources * Update detr.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-08-04 12:25:53 -07:00
Jan Netík	0bd91cc822	Add support for `ModernBertForMultipleChoice` (#39232) * implement ModernBertForMultipleChoice * fixup, style, repo consistency * generate modeling_modernbert * add tests + docs * fix test	2025-08-04 20:45:43 +02:00
Yih-Dar	801e869b67	send some feedback when manually building doc via comment (#39889) * fix * fix * fix * Update .github/workflows/pr_build_doc_with_comment.yml Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-08-04 18:20:48 +00:00
Yih-Dar	ee7eb2d0b1	Update cohere2 vision test (#39888) * fix * fix * fix * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-08-04 20:08:18 +02:00
rohitthewanderer	3bafa128dc	[DOCS] : Improved mimi model card (#39824) * [DOCS] : Improved mimi model card * Removed additional header * Review: addressed feedback * Update mimi.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-08-04 10:07:06 -07:00
Pavel Iakubovskii	192acc2d0f	Fix link to models in README (#39880) Update README.md	2025-08-04 09:34:41 -07:00
Pavel Iakubovskii	7dca2ff8cf	[typing] better return type hint for `AutoModelForCausalLM` and `AutoModelForImageTextToText` (#39881) * Better return type hint for AutoModelForCausalLM and AutoModelForImageTextToText * fix imports * fix	2025-08-04 15:03:53 +00:00
Yih-Dar	3edd14610e	Set `torch.backends.cudnn.allow_tf32 = False` for CI (#39885) * fix * fix * [test all] --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-08-04 16:55:16 +02:00
Quentin Gallouédec	e3505cd4dc	Replace `Tokenizer` with `PreTrainedTokenizerFast` in `ContinuousBatchProcessor` (#39858) Replace Tokenizer with PreTrainedTokenizerFast in ContinuousBatchProcessor	2025-08-04 16:39:19 +02:00
Cyril Vallez	380b2a0317	Rework add-new-model-like with modular and make test filenames coherent (#39612) * remove tf/flax * fix * style * Update add_new_model_like.py * work in progress * continue * more cleanup * simplify and first final version * fixes -> it works * add linter checks * Update add_new_model_like.py * fix * add modular conversion at the end * Update add_new_model_like.py * add video processor * Update add_new_model_like.py * Update add_new_model_like.py * Update add_new_model_like.py * fix * Update image_processing_auto.py * Update image_processing_auto.py * fix post rebase * start test filenames replacement * rename all test_processor -> test_processing * fix copied from * add docstrings * Update add_new_model_like.py * fix regex * improve wording * Update add_new_model_like.py * Update add_new_model_like.py * Update add_new_model_like.py * start adding test * fix * fix * proper first test * tests * fix * fix * fix * fix * modular can be used from anywhere * protect import * fix * Update add_new_model_like.py * fix	2025-08-04 14:41:09 +02:00

19936 Commits