Convert binary/image/model files to Git LFS pointers

Add Git LFS tracking for binary/model/image files
add_toctree.yml
2026-04-11 01:50:39 +09:00 · 2026-04-11 01:45:26 +09:00 · 2025-08-30 15:57:15 +09:00 · 2025-08-27 18:54:00 +09:00 · 2025-08-27 18:53:52 +09:00 · 2025-08-27 13:54:56 +09:00
4392 changed files with 5574 additions and 6550 deletions
--- a/._.DS_Store
+++ b/._.DS_Store
--- a/._.circleci
+++ b/._.circleci
--- a/._.git
+++ b/._.git
--- a/._.gitattributes
+++ b/._.gitattributes
--- a/._.github
+++ b/._.github
--- a/._.gitignore
+++ b/._.gitignore
--- a/._AGENTS.md
+++ b/._AGENTS.md
--- a/._CITATION.cff
+++ b/._CITATION.cff
--- a/._CODE_OF_CONDUCT.md
+++ b/._CODE_OF_CONDUCT.md
--- a/._ISSUES.md
+++ b/._ISSUES.md
--- a/._LICENSE
+++ b/._LICENSE
--- a/._awesome-transformers.md
+++ b/._awesome-transformers.md
--- a/._benchmark
+++ b/._benchmark
--- a/._docker
+++ b/._docker
--- a/._docs
+++ b/._docs
--- a/._examples
+++ b/._examples
--- a/._i18n
+++ b/._i18n
--- a/._notebooks
+++ b/._notebooks
--- a/._scripts
+++ b/._scripts
--- a/._src
+++ b/._src
--- a/._templates
+++ b/._templates
--- a/._tests
+++ b/._tests
--- a/._utils
+++ b/._utils
--- a/.circleci/._TROUBLESHOOT.md
+++ b/.circleci/._TROUBLESHOOT.md
--- a/.circleci/._config.yml
+++ b/.circleci/._config.yml
--- a/.circleci/._parse_test_outputs.py
+++ b/.circleci/._parse_test_outputs.py
--- a/.gitattributes
+++ b/.gitattributes
@@ -1,4 +1,13 @@
 *.py	eol=lf
 *.rst	eol=lf
 *.md	eol=lf
-*.mdx   eol=lf
+*.mdx   eol=lf
+*.model filter=lfs diff=lfs merge=lfs -text
+*.png filter=lfs diff=lfs merge=lfs -text
+*.jpg filter=lfs diff=lfs merge=lfs -text
+*.jpeg filter=lfs diff=lfs merge=lfs -text
+*.gif filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
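
Review note: with the `filter=lfs` attributes above, matching files are stored in the repository as small text pointers rather than as their binary content, which is what the "Convert ... to Git LFS pointers" commit refers to. The sketch below builds and recognizes such a pointer using the three-line layout from the Git LFS pointer specification (`version`, `oid sha256:...`, `size`). The helper names `make_lfs_pointer` and `is_lfs_pointer` are hypothetical and are not part of this repository or of the Git LFS tooling. It is an illustration of the format, not a replacement for `git lfs`.

```python
import hashlib

# Version line used by Git LFS v1 pointer files.
LFS_SPEC_URL = "https://git-lfs.github.com/spec/v1"


def make_lfs_pointer(content: bytes) -> str:
    """Return the pointer text that would stand in for `content` (hypothetical helper)."""
    oid = hashlib.sha256(content).hexdigest()
    # Keys after `version` appear in alphabetical order: oid, then size.
    return f"version {LFS_SPEC_URL}\noid sha256:{oid}\nsize {len(content)}\n"


def is_lfs_pointer(text: str) -> bool:
    """Rough check that `text` looks like an LFS pointer (hypothetical helper)."""
    lines = text.splitlines()
    return (
        len(lines) >= 3
        and lines[0] == f"version {LFS_SPEC_URL}"
        and lines[1].startswith("oid sha256:")
        and lines[2].startswith("size ")
    )


if __name__ == "__main__":
    pointer = make_lfs_pointer(b"fake model weights")
    print(pointer, end="")
    print(is_lfs_pointer(pointer))
```

A check like `is_lfs_pointer` can help confirm, after the conversion commit, that a tracked `*.bin` or `*.png` file in the index really contains a pointer and not the original bytes.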
--- a/.github/._ISSUE_TEMPLATE
+++ b/.github/._ISSUE_TEMPLATE
--- a/.github/._PULL_REQUEST_TEMPLATE.md
+++ b/.github/._PULL_REQUEST_TEMPLATE.md
--- a/.github/._conda
+++ b/.github/._conda
--- a/.github/._scripts
+++ b/.github/._scripts
--- a/.github/._workflows
+++ b/.github/._workflows
--- a/.github/ISSUE_TEMPLATE/._bug-report.yml
+++ b/.github/ISSUE_TEMPLATE/._bug-report.yml
--- a/.github/ISSUE_TEMPLATE/._config.yml
+++ b/.github/ISSUE_TEMPLATE/._config.yml
--- a/.github/ISSUE_TEMPLATE/._feature-request.yml
+++ b/.github/ISSUE_TEMPLATE/._feature-request.yml
--- a/.github/ISSUE_TEMPLATE/._i18n.md
+++ b/.github/ISSUE_TEMPLATE/._i18n.md
--- a/.github/ISSUE_TEMPLATE/._migration.yml
+++ b/.github/ISSUE_TEMPLATE/._migration.yml
--- a/.github/ISSUE_TEMPLATE/._new-model-addition.yml
+++ b/.github/ISSUE_TEMPLATE/._new-model-addition.yml
--- a/.github/conda/._build.sh
+++ b/.github/conda/._build.sh
--- a/.github/conda/._meta.yaml
+++ b/.github/conda/._meta.yaml
--- a/.github/scripts/._assign_reviewers.py
+++ b/.github/scripts/._assign_reviewers.py
--- a/.github/scripts/._codeowners_for_review_action
+++ b/.github/scripts/._codeowners_for_review_action
--- a/.github/workflows/._TROUBLESHOOT.md
+++ b/.github/workflows/._TROUBLESHOOT.md
--- a/.github/workflows/._add-model-like.yml
+++ b/.github/workflows/._add-model-like.yml
--- a/.github/workflows/._assign-reviewers.yml
+++ b/.github/workflows/._assign-reviewers.yml
--- a/.github/workflows/._build-ci-docker-images.yml
+++ b/.github/workflows/._build-ci-docker-images.yml
--- a/.github/workflows/._build-docker-images.yml
+++ b/.github/workflows/._build-docker-images.yml
--- a/.github/workflows/._build-nightly-ci-docker-images.yml
+++ b/.github/workflows/._build-nightly-ci-docker-images.yml
--- a/.github/workflows/._build-past-ci-docker-images.yml
+++ b/.github/workflows/._build-past-ci-docker-images.yml
--- a/.github/workflows/._check_tiny_models.yml
+++ b/.github/workflows/._check_tiny_models.yml
--- a/.github/workflows/._get-pr-info.yml
+++ b/.github/workflows/._get-pr-info.yml
--- a/.github/workflows/._get-pr-number.yml
+++ b/.github/workflows/._get-pr-number.yml
--- a/.github/workflows/._model_jobs_intel_gaudi.yml
+++ b/.github/workflows/._model_jobs_intel_gaudi.yml
--- a/.github/workflows/._new_model_pr_merged_notification.yml
+++ b/.github/workflows/._new_model_pr_merged_notification.yml
--- a/.github/workflows/._pr-style-bot.yml
+++ b/.github/workflows/._pr-style-bot.yml
--- a/.github/workflows/._push-important-models.yml
+++ b/.github/workflows/._push-important-models.yml
--- a/.github/workflows/._release-conda.yml
+++ b/.github/workflows/._release-conda.yml
--- a/.github/workflows/._self-nightly-past-ci-caller.yml
+++ b/.github/workflows/._self-nightly-past-ci-caller.yml
--- a/.github/workflows/._self-past-caller.yml
+++ b/.github/workflows/._self-past-caller.yml
--- a/.github/workflows/._self-push-amd-mi210-caller.yml
+++ b/.github/workflows/._self-push-amd-mi210-caller.yml
--- a/.github/workflows/._self-push-amd-mi250-caller.yml
+++ b/.github/workflows/._self-push-amd-mi250-caller.yml
--- a/.github/workflows/._self-push-amd.yml
+++ b/.github/workflows/._self-push-amd.yml
--- a/.github/workflows/._self-push-caller.yml
+++ b/.github/workflows/._self-push-caller.yml
--- a/.github/workflows/._self-scheduled-amd-caller.yml
+++ b/.github/workflows/._self-scheduled-amd-caller.yml
--- a/.github/workflows/._self-scheduled-amd-mi250-caller.yml
+++ b/.github/workflows/._self-scheduled-amd-mi250-caller.yml
--- a/.github/workflows/._self-scheduled-intel-gaudi.yml
+++ b/.github/workflows/._self-scheduled-intel-gaudi.yml
--- a/.github/workflows/._self-scheduled-intel-gaudi3-caller.yml
+++ b/.github/workflows/._self-scheduled-intel-gaudi3-caller.yml
--- a/.github/workflows/._ssh-runner.yml
+++ b/.github/workflows/._ssh-runner.yml
--- a/.github/workflows/._stale.yml
+++ b/.github/workflows/._stale.yml
--- a/.github/workflows/._trufflehog.yml
+++ b/.github/workflows/._trufflehog.yml
--- a/.github/workflows/._update_metdata.yml
+++ b/.github/workflows/._update_metdata.yml
--- a/.github/workflows/._upload_pr_documentation.yml
+++ b/.github/workflows/._upload_pr_documentation.yml
--- a/.github/workflows/self-nightly-caller.yml
+++ b/.github/workflows/self-nightly-caller.yml
@@ -12,12 +12,34 @@ on:
    branches:
      - run_ci_with_nightly_torch*

+# Used for `push` to easily modify the target workflow runs to compare against
+env:
+    prev_workflow_run_id: ""
+    other_workflow_run_id: ""
+
+
 jobs:
  build_nightly_torch_ci_images:
    name: Build CI Docker Images with nightly torch
    uses: ./.github/workflows/build-nightly-ci-docker-images.yml
    secrets: inherit

+  setup:
+    name: Setup
+    runs-on: ubuntu-22.04
+    steps:
+      - name: Setup
+        run: |
+          mkdir "setup_values"
+          echo "${{ inputs.prev_workflow_run_id || env.prev_workflow_run_id }}" > "setup_values/prev_workflow_run_id.txt"
+          echo "${{ inputs.other_workflow_run_id || env.other_workflow_run_id }}" > "setup_values/other_workflow_run_id.txt"
+
+      - name: Upload artifacts
+        uses: actions/upload-artifact@v4
+        with:
+          name: setup_values
+          path: setup_values
+
  model-ci:
    name: Model CI
    needs: build_nightly_torch_ci_images
--- a/.github/workflows/slack-report.yml
+++ b/.github/workflows/slack-report.yml
@@ -36,7 +36,7 @@ jobs:
  send_results:
    name: Send results to webhook
    runs-on: ubuntu-22.04
-    if: always()
+    if: always() && !cancelled()
    steps:
      - name: Preliminary job status
        shell: bash
--- a/benchmark/._README.md
+++ b/benchmark/._README.md
--- a/benchmark/._init.py
+++ b/benchmark/._init.py
--- a/benchmark/._benchmark.py
+++ b/benchmark/._benchmark.py
--- a/benchmark/._config
+++ b/benchmark/._config
--- a/benchmark/._default.yml
+++ b/benchmark/._default.yml
--- a/benchmark/._grafana_dashboard.json
+++ b/benchmark/._grafana_dashboard.json
--- a/benchmark/._grafana_datasource.yaml
+++ b/benchmark/._grafana_datasource.yaml
--- a/benchmark/._optimum_benchmark_wrapper.py
+++ b/benchmark/._optimum_benchmark_wrapper.py
--- a/docker/._README.md
+++ b/docker/._README.md
--- a/docker/._transformers-all-latest-gpu
+++ b/docker/._transformers-all-latest-gpu
--- a/docker/._transformers-doc-builder
+++ b/docker/._transformers-doc-builder
--- a/docker/._transformers-gpu
+++ b/docker/._transformers-gpu
--- a/docker/._transformers-past-gpu
+++ b/docker/._transformers-past-gpu
--- a/docker/._transformers-pytorch-amd-gpu
+++ b/docker/._transformers-pytorch-amd-gpu
--- a/docker/._transformers-pytorch-deepspeed-amd-gpu
+++ b/docker/._transformers-pytorch-deepspeed-amd-gpu
--- a/docker/._transformers-pytorch-deepspeed-latest-gpu
+++ b/docker/._transformers-pytorch-deepspeed-latest-gpu
--- a/docker/._transformers-pytorch-deepspeed-nightly-gpu
+++ b/docker/._transformers-pytorch-deepspeed-nightly-gpu
--- a/docker/._transformers-pytorch-gpu
+++ b/docker/._transformers-pytorch-gpu
--- a/docker/._transformers-pytorch-tpu
+++ b/docker/._transformers-pytorch-tpu
--- a/docker/._transformers-pytorch-xpu
+++ b/docker/._transformers-pytorch-xpu
--- a/docker/._transformers-quantization-latest-gpu
+++ b/docker/._transformers-quantization-latest-gpu
--- a/docker/._transformers-tensorflow-gpu
+++ b/docker/._transformers-tensorflow-gpu
--- a/docker/transformers-all-latest-gpu/Dockerfile
+++ b/docker/transformers-all-latest-gpu/Dockerfile
@@ -32,7 +32,10 @@ RUN python3 -m pip uninstall -y flax jax

 RUN python3 -m pip install --no-cache-dir -U timm

-RUN python3 -m pip install --no-cache-dir git+https://github.com/facebookresearch/detectron2.git pytesseract
+RUN [ "$PYTORCH" != "pre" ] && python3 -m pip install --no-cache-dir git+https://github.com/facebookresearch/detectron2.git || echo "Don't install detectron2 with nightly torch"
+
+RUN python3 -m pip install --no-cache-dir pytesseract
+
 RUN python3 -m pip install -U "itsdangerous<2.1.0"

 RUN python3 -m pip install --no-cache-dir git+https://github.com/huggingface/accelerate@main#egg=accelerate
@@ -52,7 +55,7 @@ RUN python3 -m pip install --no-cache-dir bitsandbytes
 RUN python3 -m pip install --no-cache-dir quanto

 # After using A10 as CI runner, let's run FA2 tests
-RUN python3 -m pip uninstall -y ninja && python3 -m pip install --no-cache-dir ninja && python3 -m pip install flash-attn --no-cache-dir --no-build-isolation
+RUN [ "$PYTORCH" != "pre" ] && python3 -m pip uninstall -y ninja && python3 -m pip install --no-cache-dir ninja && python3 -m pip install flash-attn --no-cache-dir --no-build-isolation || echo "Don't install FA2 with nightly torch"

 # TODO (ydshieh): check this again
 # `quanto` will install `ninja` which leads to many `CUDA error: an illegal memory access ...` in some model tests
--- a/docker/transformers-doc-builder/._Dockerfile
+++ b/docker/transformers-doc-builder/._Dockerfile
--- a/docker/transformers-gpu/._Dockerfile
+++ b/docker/transformers-gpu/._Dockerfile
--- a/docker/transformers-past-gpu/._Dockerfile
+++ b/docker/transformers-past-gpu/._Dockerfile
Author	SHA1	Message	Date
SUMIN	ccb1d06ecf	Convert binary/image/model files to Git LFS pointers	2026-04-11 01:50:39 +09:00
SUMIN	e4b809e5b2	Add Git LFS tracking for binary/model/image files	2026-04-11 01:45:26 +09:00
ssum21	e52c5890d1	add_toctree.yml	2025-08-30 15:57:15 +09:00
SSUM	b80c173b8f	Update docs/source/ko/model_doc/deepseek_v3.md Co-authored-by: Kim Juwon <81630351+Kim-Ju-won@users.noreply.github.com>	2025-08-27 18:54:00 +09:00
SSUM	15b4988bb7	Update docs/source/ko/model_doc/deepseek_v3.md Co-authored-by: Kim Juwon <81630351+Kim-Ju-won@users.noreply.github.com>	2025-08-27 18:53:52 +09:00
SSUM	231653db22	Merge branch 'main' into ko-deepseek_v3.md	2025-08-27 13:54:56 +09:00
Yih-Dar	ff8b88a948	Fix nightly torch CI (#40469) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-08-26 22:02:15 +02:00
Yih-Dar	74ad608a2b	Not to shock AMD team by the cancelled workflow run notification ❤️ 💖 (#40467)	2025-08-26 20:53:24 +02:00
SowmiyaNarayanan G	c8c7623f20	Update SegFormer model card (#40417 ) * Update SegFormer model card * Update docs/source/en/model_doc/segformer.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/segformer.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/segformer.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/segformer.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/segformer.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/segformer.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/segformer.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update the segformer model card * Remove quantization example --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-08-26 08:27:25 -07:00
StevenBucaille	78f32c3917	[pipeline] Add Keypoint Matching pipeline (#39970 ) * feat: keypoint-matcher pipeline * docs: added keypoint-matcher pipeline in docs * fix: added missing statements for repo consistency * docs: updated SuperGlue, LightGlue and EfficientLoFTR docs * Apply suggestions from code review Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * test: fixed run_pipeline_test * update pipeline typing and docs * update tests * update docs snippets * Fix import error * fix: pipeline init * pt framework --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-08-26 15:26:57 +01:00
Joao Gante	6451294f6f	[RoPE] explicit factor > implicit factor in YaRN (#40320) explicit factor > implicit factor	2025-08-26 14:58:28 +01:00
audioXD	5a8ba87ecf	[fast_image_processor] fix image normalization for resize (#40436)	2025-08-26 13:49:51 +00:00
VED	0ce6709e70	deci gguf support (#38669) * deci gguf support * make style * tests for deci * try except removed * style * try except removed	2025-08-26 13:43:17 +00:00
Matt	263d06fedc	Fix extra template loading (#40455) * Fix extra template loading * Reformat * Trigger tests	2025-08-26 14:01:01 +01:00
Pedro Cuenca	58cebc848b	flash_paged: s_aux may not exist (#40434 ) Some implementations (i.e., https://huggingface.co/kernels-community/vllm-flash-attn3) support an `s_aux` arg for attention sinks, but others (https://huggingface.co/kernels-community/flash-attn) do not. If s_aux is present in the kwargs, we forward it, otherwise we don't. The user will still get an error if they use a model like gpt-oss-20b with an implementation that does not support `s_aux`, but models that don't use it won't error out. For example, [this is currently failing](`399cd5c04b/examples/pytorch/continuous_batching.py (L16)`) because we are sending `s_aux: None` in the dict.	2025-08-26 13:15:59 +02:00
Rémi Ouazan	34108a2230	Continuous batching refactor (#40426 ) * Rework of the CB example * Further rework of CB example * Refactor PA cache, slice on tokens, add debug prints -- WIP * Slice cache -- WIP * Added a mechanism to check batched outputs in CB script * Less logging, debug flag for slice, !better reset! -- WIP * QOL and safety margins * Refactor and style * Better saving of cb example * Fix * Fixes and QOL * Mor einformations about metrics * Further logging * Style * Licenses * Removed some comments * Add a slice input flag * Fix in example * Added back some open-telemetry deps * Removed some aux function * Added FA2 option to example script * Fixed math (all of it) * Added a simple example * Renamed core to classes * Made allocation of attention mask optionnal * Style	2025-08-26 13:01:42 +02:00
Manuel de Prada Corral	49e168ff08	🚨 Remove Contrastive Search decoding strategy (#40428 ) * delete go brrr * fix tests * review	2025-08-26 12:31:46 +02:00
Rémi Ouazan	b8184b7ce9	Make cache_config not mandatory (#40316 ) * Relaxed assumptions on cache_config * Review compliance * Style * Styyyle * Removed default and added args * Rebase mishapfix * Propagate args to TorchExportableModuleForDecoderOnlyLM * Fix the test I wanted fixed in this PR * Added some AMD expectation related to cache tests	2025-08-26 12:06:17 +02:00
Yao Matrix	32fcc24667	rename get_cuda_warm_up_factor to get_accelerator_warm_up_factor (#40363 ) Signed-off-by: YAO Matrix <matrix.yao@intel.com> Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-08-26 09:56:35 +00:00
Raushan Turganbay	f690a2a1e0	[video processors] decode only sampled videos -> less RAM and faster processing (#39600 ) * draft update two models for now * batch update all VLMs first * update some more image processors * update * fix a few tests * just make CI green for now * fix copies * update once more * update * unskip the test * fix these two * fix torchcodec audio loading * maybe * yay, i fixed torchcodec installation and now can actually test it * fix copies deepseek * make sure the metadata is returrned when users request it * add docs * update * fixup * Update src/transformers/audio_utils.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/glm4v/video_processing_glm4v.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * update * what if we set some metadata attr to `None` * fix CI * fix one test * fix 4 channel test * fix glm timestemps * rebase gone wrong * raise warning once * fixup * typo * fix copies * ifx smolvlm test * this is why torch's official benchmark was faster, set threads to `0` * Apply style fixes --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-08-26 11:38:02 +02:00
Xin Yao	64ae6e6b1d	fix qwen25-vl grad acc (#40333 ) * fix qwen25—vl grad acc * fix Qwen2_5_VLForConditionalGeneration for accepts_loss_kwargs * fix ci * fix ci * fix typo * fix CI	2025-08-26 09:30:06 +00:00
Kashif Rasul	6d2bb1e04d	[Trainer] accelerate contextparallel support in trainer (#40205 ) * initial context_parallel_size support in trainer * For context parallelism, use AVG instead of SUM to avoid over-accounting tokens * use parallelism_config.cp_enabled * add parallelism_config to trainer state * warn when auto-enabling FSDP * fix some reviews * WIP: somewhat matching loss * Feat: add back nested_gather * Feat: cleanup * Fix: raise on non-sdpa attn * remove context_parallel_size from TrainingArguments * if we have parallelism_config, we defer to get_state_dict from accelerate * fix form review * Feat: add parallelism config support * Chore: revert some unwanted formatting changes * Fix: check None * Check none 2 * Fix: remove duplicate import * Update src/transformers/trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Fin * require accerelate 1.10.1 and higer --------- Co-authored-by: S1ro1 <matej.sirovatka@gmail.com> Co-authored-by: Matej Sirovatka <54212263+S1ro1@users.noreply.github.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-08-26 09:28:48 +00:00
Pavel Iakubovskii	63caaea1fb	Refactor ViT-like models (#39816 ) * refactor vit * fix * fixup * turn off FX tests * AST * deit * dinov2 * dinov2_with_registers * dpt * depth anything (nit) * depth pro (nit) * ijepa * ijepa (modular) * prompt_depth_anything (nit) * vilt (nit) * zoedepth (nit) * videomae * vit_mae * vit_msn * vivit * yolos * eomt * vitpose * update auto backbone * disable `fx` and export tests (dnov2, dpt, ijepa, vit, vitpose) * fix kwargs for backbone * fix * convnext * fixup * update convnext layernorm * fix-copies layer_norm * convnextv2 * explicit output_hidden_states for models with backbones * explicit hidden states collection for dinov2 * tests fixed * fix DPT as well * fix dinov2 with registers * add comment	2025-08-26 11:14:06 +02:00
Yih-Dar	922e65b3fc	Fix non FA2 tests after FA2 installed in CI docker image (#40430 ) * up * up * up * up * up * up * up * up * up * up * up * up * up --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-08-26 10:36:50 +02:00
ssum21	795ae8f282	docs : 4N3MONE recommandced modified contents	2025-08-09 20:07:42 -07:00
ssum21	bdba1f83a8	fix: glossary edits	2025-07-25 11:06:11 -07:00
ssum21	61eb8b32cc	fix: manual edits	2025-07-24 16:07:28 -07:00
ssum21	4d297c2e8c	feat: nmt draft	2025-07-24 15:57:31 -07:00
ssum21	0dc80fcdad	docs: ko: deepseek_v3.md	2025-07-24 15:54:56 -07:00