HuggingFace_transformer

SUMIN/HuggingFace_transformer

Commit Graph

ccb1d06ecf Convert binary/image/model files to Git LFS pointers ko-deepseek_v3.md SUMIN 2026-04-11 01:50:39 +09:00
e4b809e5b2 Add Git LFS tracking for binary/model/image files SUMIN 2026-04-11 01:45:26 +09:00
e52c5890d1 add_toctree.yml ssum21 2025-08-30 15:57:15 +09:00
b80c173b8f Update docs/source/ko/model_doc/deepseek_v3.md SSUM 2025-08-27 18:54:00 +09:00
15b4988bb7 Update docs/source/ko/model_doc/deepseek_v3.md SSUM 2025-08-27 18:53:52 +09:00
231653db22 Merge branch 'main' into ko-deepseek_v3.md SSUM 2025-08-27 13:54:56 +09:00
ff8b88a948 Fix nightly torch CI (#40469) Yih-Dar 2025-08-26 22:02:15 +02:00
74ad608a2b Not to shock AMD team by the cancelled workflow run notification ❤️ 💖 (#40467) Yih-Dar 2025-08-26 20:53:24 +02:00
c8c7623f20 Update SegFormer model card (#40417) SowmiyaNarayanan G 2025-08-26 10:27:25 -05:00
78f32c3917 [pipeline] Add Keypoint Matching pipeline (#39970) StevenBucaille 2025-08-26 10:26:57 -04:00
6451294f6f [RoPE] explicit factor > implicit factor in YaRN (#40320) Joao Gante 2025-08-26 14:58:28 +01:00
5a8ba87ecf [fast_image_processor] fix image normalization for resize (#40436) audioXD 2025-08-26 15:49:51 +02:00
0ce6709e70 deci gguf support (#38669) VED 2025-08-26 19:13:17 +05:30
263d06fedc Fix extra template loading (#40455) Matt 2025-08-26 14:01:01 +01:00
58cebc848b flash_paged: s_aux may not exist (#40434) Pedro Cuenca 2025-08-26 13:15:59 +02:00
34108a2230 Continuous batching refactor (#40426) Rémi Ouazan 2025-08-26 13:01:42 +02:00
49e168ff08 🚨 Remove Contrastive Search decoding strategy (#40428) Manuel de Prada Corral 2025-08-26 12:31:46 +02:00
b8184b7ce9 Make cache_config not mandatory (#40316) Rémi Ouazan 2025-08-26 12:06:17 +02:00
32fcc24667 rename get_cuda_warm_up_factor to get_accelerator_warm_up_factor (#40363) Yao Matrix 2025-08-26 02:56:35 -07:00
f690a2a1e0 [video processors] decode only sampled videos -> less RAM and faster processing (#39600) Raushan Turganbay 2025-08-26 11:38:02 +02:00
64ae6e6b1d fix qwen25-vl grad acc (#40333) Xin Yao 2025-08-26 17:30:06 +08:00
6d2bb1e04d [Trainer] accelerate contextparallel support in trainer (#40205) Kashif Rasul 2025-08-26 11:28:48 +02:00
63caaea1fb Refactor ViT-like models (#39816) Pavel Iakubovskii 2025-08-26 10:14:06 +01:00
922e65b3fc Fix non FA2 tests after FA2 installed in CI docker image (#40430) Yih-Dar 2025-08-26 10:36:50 +02:00
511d3d1683 fix: manual edits ko-bigbird.md ssum21 2025-08-26 11:39:09 +09:00
4d8e46e151 feat: nmt draft ssum21 2025-08-26 11:25:45 +09:00
7788be8497 docs: ko: BigBird.md ssum21 2025-08-26 11:11:12 +09:00
e68146fbe7 Fix collated reports model name entry (#40441) main ivarflakstad 2025-08-25 22:36:01 +02:00
8ce633cc75 InternVL MI325 test expectations (#40387) Ákos Hadnagy 2025-08-25 22:00:35 +02:00
7637d298b3 Fix collated reports uploading (#40440) ivarflakstad 2025-08-25 21:49:59 +02:00
fa59cf9c9f Fix https://github.com/huggingface/transformers/issues/40292 (#40439) id01 2025-08-25 12:12:57 -07:00
f0e87b436d Fix collated reports model directory traversal (#40437) ivarflakstad 2025-08-25 20:01:58 +02:00
ef406902bf Gemma3 text fixes: Add expectations for MI325 (#40384) Ákos Hadnagy 2025-08-25 19:57:50 +02:00
c81723d31b 🌐 [i18n-KO] Translated models.md to Korean (#39518) Judy 2025-08-26 01:17:08 +09:00
6b5eab70e4 Remove working-dir from collated reports job (#40435) ivarflakstad 2025-08-25 18:14:35 +02:00
1763ef2951 [docs] remove last references to transformers TF classes/methods (#40429) Joao Gante 2025-08-25 16:30:59 +01:00
eac4f00bdf Fix typo and improve GPU kernel check error message in MXFP4 quantization (#40349) (#40408) Olumayowa Akinkuehinmi 2025-08-25 16:21:55 +01:00
d8f2edcc46 Add tokenizer_kwargs argument to the text generation pipeline (#40364) Joshua Chin 2025-08-25 08:21:19 -07:00
1a35d07f56 Update collated reports working directory and --path (#40433) ivarflakstad 2025-08-25 17:18:26 +02:00
399cd5c04b Fix modular for modernbert-decoder (#40431) Cyril Vallez 2025-08-25 16:50:49 +02:00
ea8d9c8f06 🚨 Remove DoLa decoding strategy (#40082) Manuel de Prada Corral 2025-08-25 16:33:27 +02:00
6bf6f8490c [Mxfp4] Add a way to save with a quantization method (#40176) Arthur 2025-08-25 16:27:19 +02:00
04c2bae3a8 Fix label smoothing incompatibility with multi-label classification (#40296) Andrew Chauzov 2025-08-25 16:23:31 +02:00
3b5b9f6518 Fix processing tests (#40379) Raushan Turganbay 2025-08-25 14:50:54 +02:00
a0a37b3250 Gpt oss optim (#40304) jiqing-feng 2025-08-25 20:36:33 +08:00
d73181b3fc Fix UnboundLocalError in WER metric computation (#40402) ρrαnαm 2025-08-25 08:02:22 -04:00
11e12a715a Fix typo: 'seperator' to 'separator' in variable names (#40389) Prawal Sharma 2025-08-25 06:56:30 -05:00
40299134a8 Fix CI (hunyuan moe does not support fullgraph) (#40423) Cyril Vallez 2025-08-25 12:01:28 +02:00
a2b37bfd58 Fix typo: 'casual' -> 'causal' in code and documentation (#40371) (#40407) Olumayowa Akinkuehinmi 2025-08-25 10:32:15 +01:00
0031c044f8 [docs] flax/jax purge (#40372) Joao Gante 2025-08-25 10:25:00 +01:00
14b89fed24 fix to accept cumulative_seqlens from TransformersKwargs in FA (#40194) Du Wenjie 2025-08-25 17:00:13 +08:00
ba095d387d 🧹 🧹 🧹 Get set decoder cleanup (#39509) Pablo Montalvo 2025-08-25 10:57:56 +02:00
2c55c7fc94 Reactivate a lot of tests skipped for no reason anymore (#40378) Cyril Vallez 2025-08-25 10:44:43 +02:00
4f9b4e62bc Run FA2 tests in CI (#40397) Yih-Dar 2025-08-23 12:30:18 +02:00
28ca27cb2b HF papers in doc (#40381) Quentin Gallouédec 2025-08-22 15:07:08 -07:00
7d88f57fc6 Update README_zh-hans.md (#40380) tardc 2025-08-23 02:22:26 +08:00
29ddcacea3 Rework the Cache documentation (#40373) Cyril Vallez 2025-08-22 17:06:28 +02:00
dab66f15a1 Chat Template Doc Fixes (#40173) Matt 2025-08-22 15:48:33 +01:00
0a21e870c7 Bug Fix: Dynamically set return_lse flag in FlexAttention (#40352) amd-lalithnc 2025-08-22 19:19:26 +05:30
894b2d84b6 Add GptOssForTokenClassification for GPT-OSS models (#40190) Abdelrahman Kaseb 2025-08-22 16:14:46 +03:00
56d68c6706 Addiing ByteDance Seed Seed-OSS (#40272) Fazzie 2025-08-22 20:54:28 +08:00
d79b2d981f v4.55.4 v4.55.4 Arthur 2025-08-22 14:39:20 +02:00
8a6908c10d fix(example): align parameter names with the latest function definition for gdino (#40369) Yonghye Kwon 2025-08-22 21:27:58 +09:00
7db228a92a [configuration] allow to overwrite kwargs from subconfigs (#40241) Raushan Turganbay 2025-08-22 13:31:25 +02:00
19ffe0219d [processor] move commonalities to mixin (#40339) Raushan Turganbay 2025-08-22 13:04:43 +02:00
d8f6d3790a ⚠️⚠️ Use dtype instead of torch_dtype everywhere! (#39782) Cyril Vallez 2025-08-22 12:34:16 +02:00
9c25820978 [pipelines] add support to skip_special_tokens in the main text generation pipelines (#40356) Joao Gante 2025-08-22 11:12:46 +01:00
5c40e7a225 Change multimodal data links to HF hub (#40309) Raushan Turganbay 2025-08-22 11:50:04 +02:00
e018b77c89 wav2vec2 fixes (#40341) Rémi Ouazan 2025-08-22 11:32:29 +02:00
90792b730a Revert "Fix GPT-OSS swiglu_limit not passed in for MXFP4 #40197". The cherry-picked commit does not match the changes nor the PR. This reverts commit e75d67ec39. Arthur 2025-08-22 11:21:18 +02:00
a03df6acd4 Fix GPT-OSS swiglu_limit not passed in for MXFP4 (#40197) Daniel Han 2025-08-15 08:04:25 -07:00
d7fe3111ff Fix idefics3 vision embeddings indices dtype (#40360) Isotr0py 2025-08-22 17:10:45 +08:00
cf487cdf1f HunYuan opensource (#39606) yjc9696 2025-08-22 15:59:58 +08:00
8365f70e92 DOCS: Clarification on the use of label_names as an argument to TrainingArguments (#40353) Huzaifa Jawad 2025-08-22 05:19:04 +05:00
7c1169e21f [4/N]more docs to device agnostic (#40355) Yao Matrix 2025-08-21 10:22:26 -07:00
9568b506ed [generate] handle support for cache classes when num enc layers != num dec layers (#40277) Joao Gante 2025-08-21 17:35:18 +01:00
7f38068ae0 Qwen2.5-VL test fixes for ROCm (#40308) Ákos Hadnagy 2025-08-21 18:13:07 +02:00
cb1df4d26a [FA] Fix some model tests (#40350) Anton Vlasjuk 2025-08-21 18:08:21 +02:00
f46f29dd7c Remove more PyTorch 2.2 compatible code (#40337) Yuanyuan Chen 2025-08-21 23:19:53 +08:00
128f42d370 [detection] use consistent dtype for Conditional and DAB DETR positional embeddings (#40300) Aaron Keesing 2025-08-22 02:49:56 +12:00
2121d09239 [serve] add cors warnings (#40112) Joao Gante 2025-08-21 14:32:36 +01:00
b40b834ab1 Clean up XCodec and other codecs (#40348) Eric Bezzam 2025-08-21 15:32:00 +02:00
75aa7c7252 [ModernBert] Prevent the attention mask from being None in ModernBertForSequenceClassification (#35991) Michele Corazza 2025-08-21 15:16:03 +02:00
04b751f07d Fix attention vizualizer (#40285) Pablo Montalvo 2025-08-21 15:13:35 +02:00
1e1db12304 (small) fix conditional for input_ids and input_embeds in marian (#40045) cyn 2025-08-21 09:13:14 -04:00
7f2f53424e Update test_spm_converter_bytefallback_warning (#40284) Yih-Dar 2025-08-21 14:09:28 +02:00
11a49dd9e3 T5 test and target device fixes (#40313) Ákos Hadnagy 2025-08-21 14:07:29 +02:00
c4513a9fe6 Fix links in Glm4vMoe configuration classes to point to the correct H… (#40310) Eddie Tsai 2025-08-21 19:42:53 +08:00
c7e6f9a485 Fix an infinite loop bug in recursive search of relative imports (#40326) Elad Segal 2025-08-21 14:39:43 +03:00
e95441bdb5 add type hints (#40319) wirthual 2025-08-21 13:19:59 +02:00
5c88d8fbcc Fix: Only call Trainer.align_special_tokens if model has "config" attribute (#40322) Tom Aarsen 2025-08-21 13:06:42 +02:00
c031f6f994 [docs] remove TF references from /en/model_doc (#40344) Joao Gante 2025-08-21 11:53:21 +01:00
7b060e5eb7 Add missing arguments to class constructors (#40068) Yuanyuan Chen 2025-08-21 18:22:38 +08:00
6ad7f29461 Fix deprecation warning version (#40343) Cyril Vallez 2025-08-21 12:18:23 +02:00
adf84aec21 Add DeepseekV3ForSequenceClassification for Deepseek V3 models (#40200) Abdelrahman Kaseb 2025-08-21 13:01:33 +03:00
1e2e28f3c8 Change Qwen2RMSNorm to RMSNorm from PyTorch (#40066) Yuanyuan Chen 2025-08-21 17:58:35 +08:00
022af24fcc Fix qwen-omni processor text only mode (#40336) Yuekai Zhang 2025-08-21 17:57:32 +08:00
c99ed492c7 [docs] remove flax references from /en/model_doc (#40311) Joao Gante 2025-08-21 10:52:54 +01:00
170b2708cb Fixes #40262 v4.55.3 Arthur 2025-08-21 11:03:16 +02:00
c2e3cc24e0 Fix chunked attention mask with left-padding (#40324) Cyril Vallez 2025-08-21 10:52:49 +02:00
