Commit Graph

  • ccb1d06ecf Convert binary/image/model files to Git LFS pointers (branch: ko-deepseek_v3.md) SUMIN 2026-04-11 01:50:39 +09:00
  • e4b809e5b2 Add Git LFS tracking for binary/model/image files SUMIN 2026-04-11 01:45:26 +09:00
  • e52c5890d1 add_toctree.yml ssum21 2025-08-30 15:57:15 +09:00
  • b80c173b8f Update docs/source/ko/model_doc/deepseek_v3.md SSUM 2025-08-27 18:54:00 +09:00
  • 15b4988bb7 Update docs/source/ko/model_doc/deepseek_v3.md SSUM 2025-08-27 18:53:52 +09:00
  • 231653db22 Merge branch 'main' into ko-deepseek_v3.md SSUM 2025-08-27 13:54:56 +09:00
  • ff8b88a948 Fix nightly torch CI (#40469) Yih-Dar 2025-08-26 22:02:15 +02:00
  • 74ad608a2b Not to shock AMD team by the cancelled workflow run notification ❤️ 💖 (#40467) Yih-Dar 2025-08-26 20:53:24 +02:00
  • c8c7623f20 Update SegFormer model card (#40417) SowmiyaNarayanan G 2025-08-26 10:27:25 -05:00
  • 78f32c3917 [pipeline] Add Keypoint Matching pipeline (#39970) StevenBucaille 2025-08-26 10:26:57 -04:00
  • 6451294f6f [RoPE] explicit factor > implicit factor in YaRN (#40320) Joao Gante 2025-08-26 14:58:28 +01:00
  • 5a8ba87ecf [fast_image_processor] fix image normalization for resize (#40436) audioXD 2025-08-26 15:49:51 +02:00
  • 0ce6709e70 deci gguf support (#38669) VED 2025-08-26 19:13:17 +05:30
  • 263d06fedc Fix extra template loading (#40455) Matt 2025-08-26 14:01:01 +01:00
  • 58cebc848b flash_paged: s_aux may not exist (#40434) Pedro Cuenca 2025-08-26 13:15:59 +02:00
  • 34108a2230 Continuous batching refactor (#40426) Rémi Ouazan 2025-08-26 13:01:42 +02:00
  • 49e168ff08 🚨 Remove Contrastive Search decoding strategy (#40428) Manuel de Prada Corral 2025-08-26 12:31:46 +02:00
  • b8184b7ce9 Make cache_config not mandatory (#40316) Rémi Ouazan 2025-08-26 12:06:17 +02:00
  • 32fcc24667 rename get_cuda_warm_up_factor to get_accelerator_warm_up_factor (#40363) Yao Matrix 2025-08-26 02:56:35 -07:00
  • f690a2a1e0 [video processors] decode only sampled videos -> less RAM and faster processing (#39600) Raushan Turganbay 2025-08-26 11:38:02 +02:00
  • 64ae6e6b1d fix qwen25-vl grad acc (#40333) Xin Yao 2025-08-26 17:30:06 +08:00
  • 6d2bb1e04d [Trainer] accelerate contextparallel support in trainer (#40205) Kashif Rasul 2025-08-26 11:28:48 +02:00
  • 63caaea1fb Refactor ViT-like models (#39816) Pavel Iakubovskii 2025-08-26 10:14:06 +01:00
  • 922e65b3fc Fix non FA2 tests after FA2 installed in CI docker image (#40430) Yih-Dar 2025-08-26 10:36:50 +02:00
  • 511d3d1683 fix: manual edits ko-bigbird.md ssum21 2025-08-26 11:39:09 +09:00
  • 4d8e46e151 feat: nmt draft ssum21 2025-08-26 11:25:45 +09:00
  • 7788be8497 docs: ko: BigBird.md ssum21 2025-08-26 11:11:12 +09:00
  • e68146fbe7 Fix collated reports model name entry (#40441) (branch: main) ivarflakstad 2025-08-25 22:36:01 +02:00
  • 8ce633cc75 InternVL MI325 test expectations (#40387) Ákos Hadnagy 2025-08-25 22:00:35 +02:00
  • 7637d298b3 Fix collated reports uploading (#40440) ivarflakstad 2025-08-25 21:49:59 +02:00
  • fa59cf9c9f Fix https://github.com/huggingface/transformers/issues/40292 (#40439) id01 2025-08-25 12:12:57 -07:00
  • f0e87b436d Fix collated reports model directory traversal (#40437) ivarflakstad 2025-08-25 20:01:58 +02:00
  • ef406902bf Gemma3 text fixes: Add expectations for MI325 (#40384) Ákos Hadnagy 2025-08-25 19:57:50 +02:00
  • c81723d31b 🌐 [i18n-KO] Translated models.md to Korean (#39518) Judy 2025-08-26 01:17:08 +09:00
  • 6b5eab70e4 Remove working-dir from collated reports job (#40435) ivarflakstad 2025-08-25 18:14:35 +02:00
  • 1763ef2951 [docs] remove last references to transformers TF classes/methods (#40429) Joao Gante 2025-08-25 16:30:59 +01:00
  • eac4f00bdf Fix typo and improve GPU kernel check error message in MXFP4 quantization (#40349) (#40408) Olumayowa Akinkuehinmi 2025-08-25 16:21:55 +01:00
  • d8f2edcc46 Add tokenizer_kwargs argument to the text generation pipeline (#40364) Joshua Chin 2025-08-25 08:21:19 -07:00
  • 1a35d07f56 Update collated reports working directory and --path (#40433) ivarflakstad 2025-08-25 17:18:26 +02:00
  • 399cd5c04b Fix modular for modernbert-decoder (#40431) Cyril Vallez 2025-08-25 16:50:49 +02:00
  • ea8d9c8f06 🚨 Remove DoLa decoding strategy (#40082) Manuel de Prada Corral 2025-08-25 16:33:27 +02:00
  • 6bf6f8490c [Mxfp4] Add a way to save with a quantization method (#40176) Arthur 2025-08-25 16:27:19 +02:00
  • 04c2bae3a8 Fix label smoothing incompatibility with multi-label classification (#40296) Andrew Chauzov 2025-08-25 16:23:31 +02:00
  • 3b5b9f6518 Fix processing tests (#40379) Raushan Turganbay 2025-08-25 14:50:54 +02:00
  • a0a37b3250 Gpt oss optim (#40304) jiqing-feng 2025-08-25 20:36:33 +08:00
  • d73181b3fc Fix UnboundLocalError in WER metric computation (#40402) ρrαnαm 2025-08-25 08:02:22 -04:00
  • 11e12a715a Fix typo: 'seperator' to 'separator' in variable names (#40389) Prawal Sharma 2025-08-25 06:56:30 -05:00
  • 40299134a8 Fix CI (hunyuan moe does not support fullgraph) (#40423) Cyril Vallez 2025-08-25 12:01:28 +02:00
  • a2b37bfd58 Fix typo: 'casual' -> 'causal' in code and documentation (#40371) (#40407) Olumayowa Akinkuehinmi 2025-08-25 10:32:15 +01:00
  • 0031c044f8 [docs] flax/jax purge (#40372) Joao Gante 2025-08-25 10:25:00 +01:00
  • 14b89fed24 fix to accept cumulative_seqlens from TransformersKwargs in FA (#40194) Du Wenjie 2025-08-25 17:00:13 +08:00
  • ba095d387d 🧹 🧹 🧹 Get set decoder cleanup (#39509) Pablo Montalvo 2025-08-25 10:57:56 +02:00
  • 2c55c7fc94 Reactivate a lot of tests skipped for no reason anymore (#40378) Cyril Vallez 2025-08-25 10:44:43 +02:00
  • 4f9b4e62bc Run FA2 tests in CI (#40397) Yih-Dar 2025-08-23 12:30:18 +02:00
  • 28ca27cb2b HF papers in doc (#40381) Quentin Gallouédec 2025-08-22 15:07:08 -07:00
  • 7d88f57fc6 Update README_zh-hans.md (#40380) tardc 2025-08-23 02:22:26 +08:00
  • 29ddcacea3 Rework the Cache documentation (#40373) Cyril Vallez 2025-08-22 17:06:28 +02:00
  • dab66f15a1 Chat Template Doc Fixes (#40173) Matt 2025-08-22 15:48:33 +01:00
  • 0a21e870c7 Bug Fix: Dynamically set return_lse flag in FlexAttention (#40352) amd-lalithnc 2025-08-22 19:19:26 +05:30
  • 894b2d84b6 Add GptOssForTokenClassification for GPT-OSS models (#40190) Abdelrahman Kaseb 2025-08-22 16:14:46 +03:00
  • 56d68c6706 Addiing ByteDance Seed Seed-OSS (#40272) Fazzie 2025-08-22 20:54:28 +08:00
  • d79b2d981f v4.55.4 (tag: v4.55.4) Arthur 2025-08-22 14:39:20 +02:00
  • 8a6908c10d fix(example): align parameter names with the latest function definition for gdino (#40369) Yonghye Kwon 2025-08-22 21:27:58 +09:00
  • 7db228a92a [configuration] allow to overwrite kwargs from subconfigs (#40241) Raushan Turganbay 2025-08-22 13:31:25 +02:00
  • 19ffe0219d [processor] move commonalities to mixin (#40339) Raushan Turganbay 2025-08-22 13:04:43 +02:00
  • d8f6d3790a ⚠️⚠️ Use dtype instead of torch_dtype everywhere! (#39782) Cyril Vallez 2025-08-22 12:34:16 +02:00
  • 9c25820978 [pipelines] add support to skip_special_tokens in the main text generation pipelines (#40356) Joao Gante 2025-08-22 11:12:46 +01:00
  • 5c40e7a225 Change multimodal data links to HF hub (#40309) Raushan Turganbay 2025-08-22 11:50:04 +02:00
  • e018b77c89 wav2vec2 fixes (#40341) Rémi Ouazan 2025-08-22 11:32:29 +02:00
  • 90792b730a Revert "Fix GPT-OSS swiglu_limit not passed in for MXFP4 #40197". The cherry-picked commit does not match the changes nor the PR. This reverts commit e75d67ec39. Arthur 2025-08-22 11:21:18 +02:00
  • a03df6acd4 Fix GPT-OSS swiglu_limit not passed in for MXFP4 (#40197) Daniel Han 2025-08-15 08:04:25 -07:00
  • d7fe3111ff Fix idefics3 vision embeddings indices dtype (#40360) Isotr0py 2025-08-22 17:10:45 +08:00
  • cf487cdf1f HunYuan opensource (#39606) yjc9696 2025-08-22 15:59:58 +08:00
  • 8365f70e92 DOCS: Clarification on the use of label_names as an argument to TrainingArguments (#40353) Huzaifa Jawad 2025-08-22 05:19:04 +05:00
  • 7c1169e21f [4/N]more docs to device agnostic (#40355) Yao Matrix 2025-08-21 10:22:26 -07:00
  • 9568b506ed [generate] handle support for cache classes when num enc layers != num dec layers (#40277) Joao Gante 2025-08-21 17:35:18 +01:00
  • 7f38068ae0 Qwen2.5-VL test fixes for ROCm (#40308) Ákos Hadnagy 2025-08-21 18:13:07 +02:00
  • cb1df4d26a [FA] Fix some model tests (#40350) Anton Vlasjuk 2025-08-21 18:08:21 +02:00
  • f46f29dd7c Remove more PyTorch 2.2 compatible code (#40337) Yuanyuan Chen 2025-08-21 23:19:53 +08:00
  • 128f42d370 [detection] use consistent dtype for Conditional and DAB DETR positional embeddings (#40300) Aaron Keesing 2025-08-22 02:49:56 +12:00
  • 2121d09239 [serve] add cors warnings (#40112) Joao Gante 2025-08-21 14:32:36 +01:00
  • b40b834ab1 Clean up XCodec and other codecs (#40348) Eric Bezzam 2025-08-21 15:32:00 +02:00
  • 75aa7c7252 [ModernBert] Prevent the attention mask from being None in ModernBertForSequenceClassification (#35991) Michele Corazza 2025-08-21 15:16:03 +02:00
  • 04b751f07d Fix attention vizualizer (#40285) Pablo Montalvo 2025-08-21 15:13:35 +02:00
  • 1e1db12304 (small) fix conditional for input_ids and input_embeds in marian (#40045) cyn 2025-08-21 09:13:14 -04:00
  • 7f2f53424e Update test_spm_converter_bytefallback_warning (#40284) Yih-Dar 2025-08-21 14:09:28 +02:00
  • 11a49dd9e3 T5 test and target device fixes (#40313) Ákos Hadnagy 2025-08-21 14:07:29 +02:00
  • c4513a9fe6 Fix links in Glm4vMoe configuration classes to point to the correct H… (#40310) Eddie Tsai 2025-08-21 19:42:53 +08:00
  • c7e6f9a485 Fix an infinite loop bug in recursive search of relative imports (#40326) Elad Segal 2025-08-21 14:39:43 +03:00
  • e95441bdb5 add type hints (#40319) wirthual 2025-08-21 13:19:59 +02:00
  • 5c88d8fbcc Fix: Only call Trainer.align_special_tokens if model has "config" attribute (#40322) Tom Aarsen 2025-08-21 13:06:42 +02:00
  • c031f6f994 [docs] remove TF references from /en/model_doc (#40344) Joao Gante 2025-08-21 11:53:21 +01:00
  • 7b060e5eb7 Add missing arguments to class constructors (#40068) Yuanyuan Chen 2025-08-21 18:22:38 +08:00
  • 6ad7f29461 Fix deprecation warning version (#40343) Cyril Vallez 2025-08-21 12:18:23 +02:00
  • adf84aec21 Add DeepseekV3ForSequenceClassification for Deepseek V3 models (#40200) Abdelrahman Kaseb 2025-08-21 13:01:33 +03:00
  • 1e2e28f3c8 Change Qwen2RMSNorm to RMSNorm from PyTorch (#40066) Yuanyuan Chen 2025-08-21 17:58:35 +08:00
  • 022af24fcc Fix qwen-omni processor text only mode (#40336) Yuekai Zhang 2025-08-21 17:57:32 +08:00
  • c99ed492c7 [docs] remove flax references from /en/model_doc (#40311) Joao Gante 2025-08-21 10:52:54 +01:00
  • 170b2708cb Fixes #40262 (tag: v4.55.3) Arthur 2025-08-21 11:03:16 +02:00
  • c2e3cc24e0 Fix chunked attention mask with left-padding (#40324) Cyril Vallez 2025-08-21 10:52:49 +02:00