HuggingFace_transformer

Author	SHA1	Message	Date
Ryan McConville	4ba0989eab	Clarify error message to ensure min 28x28 image supplied for Qwen 2.5 VL (#37264 ) fix: clarify error message for min 28x28 images Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-04-04 12:53:38 +01:00
Yih-Dar	352ec8ef22	pin specific `natten` version in docker file (#37274 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-04-04 13:47:16 +02:00
cyyever	edd345b52e	Fix deprecated PT functions (#37237 ) * Fix deprecated PT functions Signed-off-by: cyy <cyyever@outlook.com> * Revert some changes Signed-off-by: cyy <cyyever@outlook.com> --------- Signed-off-by: cyy <cyyever@outlook.com>	2025-04-04 12:31:11 +01:00
Yih-Dar	b016de1ae4	Fix `utils/check_bad_commit.py` (#37272 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-04-04 12:18:20 +02:00
Nikos Antoniou	f74d7da836	Introduce modular files for speech models (#35902 ) * WAV_2_VEC_2 to WAV2VEC2 * added modular files for hubert, wavlm, wav2vec2_bert, data2vec_audio * remove unnessary definitions in modulars * added modular files for UniSpeech, UniSpeechSat, Wav2Vec2Conformer * docstring fix for UniSpeechForCTC * removed unneccessary re-definition of modular classes * reverted lazy imports change on modular_model_converter, type-alias for Wav2Vec2BaseModelOutput * top-level import of deepspeed in seamless_m4t, speecht5 * avoid tracking imports inside classes, relocate lazy deepspeed, peft imports in their original locations * convert modular * tiny modular typing fixes * some more modular fixes * make style --------- Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com> Co-authored-by: Eustache Le Bihan <eulebihan@gmail.com>	2025-04-04 11:46:27 +02:00
Ita Zaporozhets	d130cd0e16	update error msg (#37207 )	2025-04-04 10:21:30 +02:00
Raushan Turganbay	41b9b92b52	[qwen-vl] fix image processor (#37258 ) * fix * add test	2025-04-03 19:48:56 +02:00
Surya Garikipati	8dd0a2b89c	Update model card for electra (#37063 ) * Update ELECTRA model card with new format * Update ELECTRA model card with new format * Update docs/source/en/model_doc/electra.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/electra.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/electra.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/electra.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/electra.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/electra.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/electra.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/electra.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/electra.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * close hfoption block --------- Co-authored-by: Wun0 <f20191221@hyderabad.bits-pilani.ac.in> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-04-03 10:45:35 -07:00
Parag Ekbote	15ac2b6ac5	Update Model Card for ModernBERT (#37052 ) * Modify Model Card for ModernBERT. * Update as per code review. Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update model card. * Update model card. --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-04-03 10:14:02 -07:00
Abhishek Ranjan	b552708694	chore: Update model doc for code_llama (#37115 ) * Update code_llama.md aims to handle https://github.com/huggingface/transformers/issues/36979#issuecomment-2758560598 sub part of https://github.com/huggingface/transformers/issues/36979 * Update docs/source/en/model_doc/code_llama.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/code_llama.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/code_llama.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * make changes as per code review * chore: make the function smaller for attention mask visualizer * chore[docs]: update code_llama.md with some more suggested changes * Update docs/source/en/model_doc/code_llama.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * chore[docs] : Update code_llama.md with indentation changes --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-04-03 10:09:41 -07:00
Bimal Gajera	2b84831a93	Update model card for Cohere (#37056 ) * Update Cohere model card to follow standard template * Update docs/source/en/model_doc/cohere.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/cohere.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/cohere.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/cohere.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/cohere.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/cohere.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update cohere.md Update code snippet for AutoModel, quantization, and transformers-cli * Update cohere.md * Update docs/source/en/model_doc/cohere.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-04-03 09:51:40 -07:00
Matt	2d46a08b63	Purge unused ModelTester code (#37085 ) * Purge correctly this time * Remove more methods from recent PRs * make fixup	2025-04-03 17:48:35 +01:00
Avigyan Sinha	1b29409d89	feat: updated model card for qwen_2.5_vl (#37099 ) * feat: updated model card for qwen_2.5_vl * applied suggested change 1 Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * applied suggested change 2 Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * applied suggested change 3 Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * fix: made requested changes for quantization and notes * suggeested model card change 4 Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * updated model card wiht suggested change 5 Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * updated model card wiht suggested change 6 Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * updated model card wiht suggested change 7 Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * feat: applied requested changes --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-04-03 09:13:26 -07:00
cyyever	8a828a747e	Add Optional to types (#37163 ) Signed-off-by: cyy <cyyever@outlook.com>	2025-04-03 16:38:01 +01:00
Ryan Mullins	3f6af96732	Adding links to ShieldGemma 2 technical report (#37247 )	2025-04-03 16:26:29 +01:00
Joao Gante	9a1c1fe7ed	[CI] green llama tests (#37244 ) * green llama tests * use cleanup instead * better test comment; cleanup upgrade * better test comment; cleanup upgrade	2025-04-03 14:15:53 +01:00
Matt	782d7d945d	Allow flexible generation params arg when checking pipeline specs (#37211 ) * Allow flexible generation params arg * Trigger tests * Add docstring and rename js_generate to hub_generate	2025-04-03 13:29:36 +01:00
Jaime Fraustro	afafb84b59	Add support for fast image processing in image-pretraining example (#37021 ) * Add support for fast image processing in image-pretraining example Fix typo: correct tuple formatting in IMAGE_PROCESSOR_MAPPING_NAMES Signed-off-by: jafraustro <jaime.fraustro.valdez@intel.com> * Use fast image processor by default Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> Signed-off-by: jafraustro <jaime.fraustro.valdez@intel.com> --------- Signed-off-by: jafraustro <jaime.fraustro.valdez@intel.com> Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-04-03 13:26:46 +01:00
Matt	34ccfebf32	Fix AST parsing when looking for remote code imports (#37245 ) * Not all Call.func nodes have id because they can be methods * Trigger tests * Trigger tests	2025-04-03 13:00:51 +01:00
Yao Matrix	f697b3f824	enable 2 types of case on XPU (#37198 ) enable 2 types of case on XPU 1. test_resize_tokens_embeddings_with_deepspeed_multi_gpu 2. test_resize_embeddings_untied_with_deepspeed_multi_gpu Signed-off-by: YAO Matrix <matrix.yao@intel.com>	2025-04-03 11:37:55 +02:00
Joao Gante	2099287a59	[CI] lazy loading external datasets (#37218 )	2025-04-03 09:57:45 +01:00
Fanli Lin	a0803a9555	[tests] fix mamba integration simple inference precision issue (#37193 ) * fix precision issue * use float32	2025-04-03 10:38:03 +02:00
Cyril Vallez	6ce238fe7a	Fix test (#37213 ) * Update test_modeling_common.py * style	2025-04-03 10:24:34 +02:00
regisss	12048990a9	Add new dim to `num_items_in_batch` if necessary (#36967 ) * Add new dim to `num_items_in_batch` if necessary * Unsqueeze only in the DP case --------- Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-04-03 09:57:03 +02:00
Raushan Turganbay	98601cc818	[Phi4] add multimodal chat template (#36996 ) * phi4 chat template * remove from valid kwargs	2025-04-03 09:52:09 +02:00
Guang Yang	c9302c0983	Fix static cache export (#37229 ) Co-authored-by: Guang Yang <guangyang@fb.com>	2025-04-03 07:05:57 +02:00
ARAVINDHAN T	2056287940	Updated model card for Qwen2 (#37192 ) * Update qwen2.md * Update qwen2.md * Update qwen2.md * Update qwen2.md * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update qwen2.md * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * 
Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-04-02 18:10:41 -07:00
Ricardo Alanis	3e96a0c32b	Update falcon model card (#37184 ) * feat: updated model card for falcon * fix:rewrite model description * fix: add link to conversion script * Update docs/source/en/model_doc/falcon.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/falcon.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/falcon.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/falcon.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/falcon.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/falcon.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/falcon.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/falcon.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * fix: Add suggested changes * fix: typo in link for quantization * Update docs/source/en/model_doc/falcon.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/falcon.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * fix: fix indent and close ticks * fix: add indent --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-04-02 17:30:37 -07:00
Purusharth Malik	199d7adf10	Updated the model card for CLIP (#37040 ) * Update clip.md * Update docs/source/en/model_doc/clip.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/clip.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/clip.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Incorporated suggested changes * Update docs/source/en/model_doc/clip.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/clip.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/clip.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-04-02 14:57:38 -07:00
Matt	126abe3461	More ReDOS fixes! (#36964 ) * More ReDOS fixes! * Slight regex cleanup * Cleanup regex replacement * Drop that regex entirely too * The regex didn't match config.json, let's make sure we don't either * Cleanup allowed_value_chars a little * Cleanup the import search * Catch multi-condition blocks too * Trigger tests * Trigger tests	2025-04-02 18:46:14 +01:00
Matt	3d133cc557	Stop DOSing the Hub in the CI (#37209 ) * As the title suggests, stop hammering the same files * make fixup * Use shutil instead of pathlib	2025-04-02 17:19:33 +01:00
Joao Gante	e90d55ebcc	[Tests] add `min_new_tokens` to prevent flaky length checks (#37175 )	2025-04-02 15:24:00 +01:00
Matt	cbfa14823b	No more dtype_byte_size() (#37144 ) * No more dtype_byte_size() * Remove function once again * Fix rebase cruft * Trigger tests	2025-04-02 14:58:38 +01:00
cyyever	7613cf1a45	Add py.typed (#37022 )	2025-04-02 14:17:27 +01:00
cyyever	32c12aaec3	[3/N] Use pyupgrade --py39-plus to improve code (#36936 ) Use pyupgrade --py39-plus to improve code Signed-off-by: cyy <cyyever@outlook.com>	2025-04-02 14:16:06 +01:00
cyyever	764ab0d46a	Merge tensor operations with device transfer operations (#37097 ) * Merge operations with to Signed-off-by: cyy <cyyever@outlook.com> * Use dtype Signed-off-by: cyy <cyyever@outlook.com> --------- Signed-off-by: cyy <cyyever@outlook.com>	2025-04-02 14:15:23 +01:00
湛露先生	c94c6ed397	Fix some code annotation typos. (#37102 ) Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>	2025-04-02 14:00:41 +01:00
Dan Saattrup Nielsen	e94d607c8b	fix: Add 'image-text-to-text' to `TASK_MAPPING` (#37107 ) Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-04-02 14:51:03 +02:00
Yih-Dar	adfc91cd46	Try to avoid/reduce some remaining CI job failures (#37202 ) * try * try * Update tests/pipelines/test_pipelines_video_classification.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-04-02 14:39:57 +02:00
Xavier Dupré	6f5dc9c82e	Fixes DynamicCache export issues due to control flow and inplace modifications (#36652 ) * Remove unnecessary masked_fill in deberta models * Enable some code when exporting but not compiling * add missing import * style * replace if by torch.cond * style * use numel * style * add unit tests * style * change empty value for dynamic cache * replace != [] by numel() * fix import issue * style	2025-04-02 12:04:40 +01:00
Jerry Zhang	a165458901	Add device workaround for int4 weight only quantization after API update (#36980 ) * merge * fix import * format * reformat * reformat --------- Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-04-02 12:42:22 +02:00
Yih-Dar	ed95493ce0	Skip code `307` in `RequestCounter` (#36953 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-04-02 11:35:46 +02:00
Raushan Turganbay	211e4dc9a4	[chat-template] fix video loading (#37146 ) * fix * add video * trigger * push new iamges * fix tests * revert --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-04-02 11:27:50 +02:00
Bowen Bao	800510c67b	[doc] Fix link for Quark quantization page (#37179 ) Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-04-01 20:57:38 +02:00
Cyril Vallez	41f5c3216c	Revert #37031 (#37178 ) Update modeling_utils.py	2025-04-01 19:48:15 +02:00
Cyril Vallez	bc2dea3f54	Fix meta state dict loading with quantizers (#37136 ) Update modeling_utils.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-04-01 18:45:58 +02:00
Yih-Dar	35253076f4	Avoid pipeline test failing related to Hub call (#37170 ) * cls * cls * cls --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-04-01 18:22:45 +02:00
Yufeng Xu	bf41e54fc8	Fixes the inconsistency of the optionality of attention_mask (#37153 ) * debugging issue 36758 * debugging issue 36758 * debugging issue 36758 * updated attn_mask type specification in _flash_attention_forward * removed pdb * added a blank line * removed indentation	2025-04-01 15:31:10 +01:00
Pavel Iakubovskii	3249c5dc15	Refactor attention for SigLIP based models (#36981 ) * Update Siglip attention implementation * Update tests for Siglip * Remove one level of indentation * Update test to be more specific * Fixup * Idefics2 * Idefics3 * Emu3 * SmolVLM * Phi4 (just init small update) * Idefics2 (test fix) * Update siglip2 tests * Update eager * trigger * Clean up * Transfer inputs to device in test * Fixing test * Fixing test * Revert contiguous * Remove unused is_flash_attn_2_available * Move flaky to specific models	2025-04-01 15:37:25 +02:00
Yao Matrix	24e311f42b	fix XPU UT error case brough by RNG difference btw XPU and CUDA (#37121 ) * fix XPU UT error case brough by RNG difference btw XPU and CUDA Signed-off-by: YAO Matrix <matrix.yao@intel.com> * enable tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits and tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits_bf16 on xpu Signed-off-by: YAO Matrix <matrix.yao@intel.com> * Revert "enable tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits and tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits_bf16 on xpu" This reverts commit 3ef83a4f0204642daa45fda56e8aca1afed24b4f. --------- Signed-off-by: YAO Matrix <matrix.yao@intel.com>	2025-04-01 13:52:55 +01:00

18502 Commits