Compare commits

...

29 Commits

Author SHA1 Message Date
ccb1d06ecf Convert binary/image/model files to Git LFS pointers
Some checks failed
Secret Leaks / trufflehog (push) Has been cancelled
2026-04-11 01:50:39 +09:00
e4b809e5b2 Add Git LFS tracking for binary/model/image files 2026-04-11 01:45:26 +09:00
ssum21
e52c5890d1 add_toctree.yml 2025-08-30 15:57:15 +09:00
SSUM
b80c173b8f Update docs/source/ko/model_doc/deepseek_v3.md
Co-authored-by: Kim Juwon <81630351+Kim-Ju-won@users.noreply.github.com>
2025-08-27 18:54:00 +09:00
SSUM
15b4988bb7 Update docs/source/ko/model_doc/deepseek_v3.md
Co-authored-by: Kim Juwon <81630351+Kim-Ju-won@users.noreply.github.com>
2025-08-27 18:53:52 +09:00
SSUM
231653db22 Merge branch 'main' into ko-deepseek_v3.md 2025-08-27 13:54:56 +09:00
Yih-Dar
ff8b88a948 Fix nightly torch CI (#40469)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-26 22:02:15 +02:00
Yih-Dar
74ad608a2b Not to shock AMD team by the cancelled workflow run notification ❤️ 💖 (#40467) 2025-08-26 20:53:24 +02:00
SowmiyaNarayanan G
c8c7623f20 Update SegFormer model card (#40417)
* Update SegFormer model card

* Update docs/source/en/model_doc/segformer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/segformer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/segformer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/segformer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/segformer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/segformer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/segformer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update the segformer model card

* Remove quantization example

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-08-26 08:27:25 -07:00
StevenBucaille
78f32c3917 [pipeline] Add Keypoint Matching pipeline (#39970)
* feat: keypoint-matcher pipeline

* docs: added keypoint-matcher pipeline in docs

* fix: added missing statements for repo consistency

* docs: updated SuperGlue, LightGlue and EfficientLoFTR docs

* Apply suggestions from code review

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* test: fixed run_pipeline_test

* update pipeline typing and docs

* update tests

* update docs snippets

* Fix import error

* fix: pipeline init

* pt framework

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-08-26 15:26:57 +01:00
Joao Gante
6451294f6f [RoPE] explicit factor > implicit factor in YaRN (#40320)
explicit factor > implicit factor
2025-08-26 14:58:28 +01:00
audioXD
5a8ba87ecf [fast_image_processor] fix image normalization for resize (#40436) 2025-08-26 13:49:51 +00:00
VED
0ce6709e70 deci gguf support (#38669)
* deci gguf support

* make style

* tests for deci

* try except removed

* style

* try except removed
2025-08-26 13:43:17 +00:00
Matt
263d06fedc Fix extra template loading (#40455)
* Fix extra template loading

* Reformat

* Trigger tests
2025-08-26 14:01:01 +01:00
Pedro Cuenca
58cebc848b flash_paged: s_aux may not exist (#40434)
Some implementations (e.g.,
https://huggingface.co/kernels-community/vllm-flash-attn3) support an
`s_aux` arg for attention sinks, but others
(https://huggingface.co/kernels-community/flash-attn) do not. If s_aux
is present in the kwargs, we forward it, otherwise we don't.

The user will still get an error if they use a model like gpt-oss-20b
with an implementation that does not support `s_aux`, but models that
don't use it won't error out. For example, [this is currently
failing](399cd5c04b/examples/pytorch/continuous_batching.py (L16))
because we are sending `s_aux: None` in the dict.
2025-08-26 13:15:59 +02:00
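
The forwarding rule described in this commit message can be sketched as follows. This is a minimal illustration and not the code from the PR: `call_attention`, `attn_without_sinks` and `attn_with_sinks` are hypothetical names, and the two kernels are stand-ins for implementations that do and do not accept `s_aux`.

```python
from typing import Any, Callable


def call_attention(attn_fn: Callable[..., Any], q: Any, k: Any, v: Any, **kwargs: Any) -> Any:
    """Forward `s_aux` only when the caller actually supplied it."""
    # Build the extra arguments conditionally instead of always sending
    # `s_aux=None`, which would break kernels that lack the parameter.
    extra = {"s_aux": kwargs["s_aux"]} if "s_aux" in kwargs else {}
    return attn_fn(q, k, v, **extra)


def attn_without_sinks(q, k, v):
    # Hypothetical kernel with no `s_aux` parameter.
    return ("no_sinks", q, k, v)


def attn_with_sinks(q, k, v, s_aux=None):
    # Hypothetical kernel that accepts attention sinks.
    return ("sinks", s_aux)
```

With this shape, a model that never sets `s_aux` works with either kernel, while a model that does set it still fails loudly (a `TypeError` here) on a kernel without support, which matches the behaviour the message describes.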
Rémi Ouazan
34108a2230 Continuous batching refactor (#40426)
* Rework of the CB example

* Further rework of CB example

* Refactor PA cache, slice on tokens, add debug prints -- WIP

* Slice cache -- WIP

* Added a mechanism to check batched outputs in CB script

* Less logging, debug flag for slice, !better reset! -- WIP

* QOL and safety margins

* Refactor and style

* Better saving of cb example

* Fix

* Fixes and QOL

* More information about metrics

* Further logging

* Style

* Licenses

* Removed some comments

* Add a slice input flag

* Fix in example

* Added back some open-telemetry deps

* Removed some aux function

* Added FA2 option to example script

* Fixed math (all of it)

* Added a simple example

* Renamed core to classes

* Made allocation of attention mask optional

* Style
2025-08-26 13:01:42 +02:00
Manuel de Prada Corral
49e168ff08 🚨 Remove Contrastive Search decoding strategy (#40428)
* delete go brrr

* fix tests

* review
2025-08-26 12:31:46 +02:00
Rémi Ouazan
b8184b7ce9 Make cache_config not mandatory (#40316)
* Relaxed assumptions on cache_config

* Review compliance

* Style

* Styyyle

* Removed default and added args

* Rebase mishapfix

* Propagate args to TorchExportableModuleForDecoderOnlyLM

* Fix the test I wanted fixed in this PR

* Added some AMD expectation related to cache tests
2025-08-26 12:06:17 +02:00
Yao Matrix
32fcc24667 rename get_cuda_warm_up_factor to get_accelerator_warm_up_factor (#40363)
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-08-26 09:56:35 +00:00
Raushan Turganbay
f690a2a1e0 [video processors] decode only sampled videos -> less RAM and faster processing (#39600)
* draft update two models for now

* batch update all VLMs first

* update some more image processors

* update

* fix a few tests

* just make CI green for now

* fix copies

* update once more

* update

* unskip the test

* fix these two

* fix torchcodec audio loading

* maybe

* yay, i fixed torchcodec installation and now can actually test it

* fix copies deepseek

* make sure the metadata is returned when users request it

* add docs

* update

* fixup

* Update src/transformers/audio_utils.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/models/glm4v/video_processing_glm4v.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* update

* what if we set some metadata attr to `None`

* fix CI

* fix one test

* fix 4 channel test

* fix glm timestamps

* rebase gone wrong

* raise warning once

* fixup

* typo

* fix copies

* fix smolvlm test

* this is why torch's official benchmark was faster, set threads to `0`

* Apply style fixes

---------

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-08-26 11:38:02 +02:00
Xin Yao
64ae6e6b1d fix qwen25-vl grad acc (#40333)
* fix qwen25-vl grad acc

* fix Qwen2_5_VLForConditionalGeneration for accepts_loss_kwargs

* fix ci

* fix ci

* fix typo

* fix CI
2025-08-26 09:30:06 +00:00
Kashif Rasul
6d2bb1e04d [Trainer] accelerate contextparallel support in trainer (#40205)
* initial context_parallel_size support in trainer

* For context parallelism, use AVG instead of SUM to avoid over-accounting tokens

* use parallelism_config.cp_enabled

* add parallelism_config to trainer state

* warn when auto-enabling FSDP

* fix some reviews

* WIP: somewhat matching loss

* Feat: add back nested_gather

* Feat: cleanup

* Fix: raise on non-sdpa attn

* remove context_parallel_size from TrainingArguments

* if we have parallelism_config, we defer to get_state_dict from accelerate

* fix from review

* Feat: add parallelism config support

* Chore: revert some unwanted formatting changes

* Fix: check None

* Check none 2

* Fix: remove duplicate import

* Update src/transformers/trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/training_args.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Fin

* require accelerate 1.10.1 and higher

---------

Co-authored-by: S1ro1 <matej.sirovatka@gmail.com>
Co-authored-by: Matej Sirovatka <54212263+S1ro1@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-08-26 09:28:48 +00:00
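
The "AVG instead of SUM" bullet in this commit can be illustrated with a small arithmetic sketch. The assumption here, which the commit message does not spell out, is that every context-parallel rank reports the token count of the same batch, so a sum across ranks counts each token once per rank. `reduce_token_count` and the numbers are hypothetical.

```python
def reduce_token_count(per_rank_counts: list, op: str) -> float:
    """Combine per-rank token counts with a SUM or AVG reduction."""
    if op == "sum":
        return float(sum(per_rank_counts))
    if op == "avg":
        return sum(per_rank_counts) / len(per_rank_counts)
    raise ValueError(f"unknown reduction: {op}")


# Assumed setup: 4 context-parallel ranks, each reporting the same
# 1000-token batch (the ranks split the sequence, not the batch).
cp_size = 4
tokens_in_batch = 1000
per_rank = [tokens_in_batch] * cp_size

summed = reduce_token_count(per_rank, "sum")    # counts every token cp_size times
averaged = reduce_token_count(per_rank, "avg")  # recovers the true batch count
```

Under that assumption the summed count is inflated by a factor of `cp_size`, which would shrink a loss normalised by it; the average leaves the count unchanged.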
Pavel Iakubovskii
63caaea1fb Refactor ViT-like models (#39816)
* refactor vit

* fix

* fixup

* turn off FX tests

* AST

* deit

* dinov2

* dinov2_with_registers

* dpt

* depth anything (nit)

* depth pro (nit)

* ijepa

* ijepa (modular)

* prompt_depth_anything (nit)

* vilt (nit)

* zoedepth (nit)

* videomae

* vit_mae

* vit_msn

* vivit

* yolos

* eomt

* vitpose

* update auto backbone

* disable `fx` and export tests (dinov2, dpt, ijepa, vit, vitpose)

* fix kwargs for backbone

* fix

* convnext

* fixup

* update convnext layernorm

* fix-copies layer_norm

* convnextv2

* explicit output_hidden_states for models with backbones

* explicit hidden states collection for dinov2

* tests fixed

* fix DPT as well

* fix dinov2 with registers

* add comment
2025-08-26 11:14:06 +02:00
Yih-Dar
922e65b3fc Fix non FA2 tests after FA2 installed in CI docker image (#40430)
* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-26 10:36:50 +02:00
ssum21
795ae8f282 docs: 4N3MONE recommended modified contents 2025-08-09 20:07:42 -07:00
ssum21
bdba1f83a8 fix: glossary edits 2025-07-25 11:06:11 -07:00
ssum21
61eb8b32cc fix: manual edits 2025-07-24 16:07:28 -07:00
ssum21
4d297c2e8c feat: nmt draft 2025-07-24 15:57:31 -07:00
ssum21
0dc80fcdad docs: ko: deepseek_v3.md 2025-07-24 15:54:56 -07:00
4392 changed files with 5574 additions and 6550 deletions

BIN
._.DS_Store Normal file

Binary file not shown.

BIN
._.circleci Normal file

Binary file not shown.

BIN
._.git Normal file

Binary file not shown.

BIN
._.gitattributes Normal file

Binary file not shown.

BIN
._.github Normal file

Binary file not shown.

BIN
._.gitignore Normal file

Binary file not shown.

BIN
._AGENTS.md Normal file

Binary file not shown.

BIN
._CITATION.cff Normal file

Binary file not shown.

BIN
._CODE_OF_CONDUCT.md Normal file

Binary file not shown.

BIN
._ISSUES.md Normal file

Binary file not shown.

BIN
._LICENSE Normal file

Binary file not shown.

BIN
._awesome-transformers.md Normal file

Binary file not shown.

BIN
._benchmark Normal file

Binary file not shown.

BIN
._docker Normal file

Binary file not shown.

BIN
._docs Normal file

Binary file not shown.

BIN
._examples Normal file

Binary file not shown.

BIN
._i18n Normal file

Binary file not shown.

BIN
._notebooks Normal file

Binary file not shown.

BIN
._scripts Normal file

Binary file not shown.

BIN
._src Normal file

Binary file not shown.

BIN
._templates Normal file

Binary file not shown.

BIN
._tests Normal file

Binary file not shown.

BIN
._utils Normal file

Binary file not shown.

BIN
.circleci/._TROUBLESHOOT.md Normal file

Binary file not shown.

BIN
.circleci/._config.yml Normal file

Binary file not shown.

11
.gitattributes vendored

@@ -1,4 +1,13 @@
 *.py eol=lf
 *.rst eol=lf
 *.md eol=lf
-*.mdx eol=lf
+*.mdx eol=lf
+*.model filter=lfs diff=lfs merge=lfs -text
+*.png filter=lfs diff=lfs merge=lfs -text
+*.jpg filter=lfs diff=lfs merge=lfs -text
+*.jpeg filter=lfs diff=lfs merge=lfs -text
+*.gif filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
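
To see which paths the new `.gitattributes` patterns would route through Git LFS, a rough check can be sketched as below. It is a simplification: it only matches the file name against the nine extension patterns from this diff and does not reproduce full gitattributes matching rules. `is_lfs_tracked` is a hypothetical helper; `git check-attr filter -- <path>` gives the authoritative answer inside a repository.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns copied from the .gitattributes diff above.
LFS_PATTERNS = [
    "*.model", "*.png", "*.jpg", "*.jpeg", "*.gif",
    "*.bin", "*.pt", "*.onnx", "*.h5",
]


def is_lfs_tracked(path: str) -> bool:
    """Return True if the file name matches one of the LFS patterns."""
    # A pattern without a slash is compared against the file name only,
    # so the directory part of the path is dropped first.
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in LFS_PATTERNS)
```

For example, `is_lfs_tracked("docs/img/logo.png")` is true, while a Python source file such as `src/transformers/trainer.py` is left as a regular text file.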

BIN
.github/._ISSUE_TEMPLATE vendored Normal file

Binary file not shown.

BIN
.github/._PULL_REQUEST_TEMPLATE.md vendored Normal file

Binary file not shown.

BIN
.github/._conda vendored Normal file

Binary file not shown.

BIN
.github/._scripts vendored Normal file

Binary file not shown.

BIN
.github/._workflows vendored Normal file

Binary file not shown.

BIN
.github/ISSUE_TEMPLATE/._bug-report.yml vendored Normal file

Binary file not shown.

BIN
.github/ISSUE_TEMPLATE/._config.yml vendored Normal file

Binary file not shown.

BIN
.github/ISSUE_TEMPLATE/._i18n.md vendored Normal file

Binary file not shown.

BIN
.github/ISSUE_TEMPLATE/._migration.yml vendored Normal file

Binary file not shown.

BIN
.github/conda/._build.sh vendored Normal file

Binary file not shown.

BIN
.github/conda/._meta.yaml vendored Normal file

Binary file not shown.

BIN
.github/scripts/._assign_reviewers.py vendored Normal file

Binary file not shown.

BIN
.github/workflows/._TROUBLESHOOT.md vendored Normal file

Binary file not shown.

BIN
.github/workflows/._add-model-like.yml vendored Normal file

Binary file not shown.

BIN
.github/workflows/._assign-reviewers.yml vendored Normal file

Binary file not shown.

BIN
.github/workflows/._get-pr-info.yml vendored Normal file

Binary file not shown.

BIN
.github/workflows/._get-pr-number.yml vendored Normal file

Binary file not shown.

BIN
.github/workflows/._pr-style-bot.yml vendored Normal file

Binary file not shown.

BIN
.github/workflows/._release-conda.yml vendored Normal file

Binary file not shown.

BIN
.github/workflows/._self-past-caller.yml vendored Normal file

Binary file not shown.

BIN
.github/workflows/._self-push-amd.yml vendored Normal file

Binary file not shown.

BIN
.github/workflows/._self-push-caller.yml vendored Normal file

Binary file not shown.

BIN
.github/workflows/._ssh-runner.yml vendored Normal file

Binary file not shown.

BIN
.github/workflows/._stale.yml vendored Normal file

Binary file not shown.

BIN
.github/workflows/._trufflehog.yml vendored Normal file

Binary file not shown.

BIN
.github/workflows/._update_metdata.yml vendored Normal file

Binary file not shown.


@@ -12,12 +12,34 @@ on:
     branches:
       - run_ci_with_nightly_torch*
+# Used for `push` to easily modify the target workflow runs to compare against
+env:
+  prev_workflow_run_id: ""
+  other_workflow_run_id: ""
 jobs:
   build_nightly_torch_ci_images:
     name: Build CI Docker Images with nightly torch
     uses: ./.github/workflows/build-nightly-ci-docker-images.yml
     secrets: inherit
+  setup:
+    name: Setup
+    runs-on: ubuntu-22.04
+    steps:
+      - name: Setup
+        run: |
+          mkdir "setup_values"
+          echo "${{ inputs.prev_workflow_run_id || env.prev_workflow_run_id }}" > "setup_values/prev_workflow_run_id.txt"
+          echo "${{ inputs.other_workflow_run_id || env.other_workflow_run_id }}" > "setup_values/other_workflow_run_id.txt"
+      - name: Upload artifacts
+        uses: actions/upload-artifact@v4
+        with:
+          name: setup_values
+          path: setup_values
   model-ci:
     name: Model CI
     needs: build_nightly_torch_ci_images
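
The `inputs.… || env.…` expressions in the setup step pick the workflow input when one is given and fall back to the `env` default otherwise. A minimal Python sketch of that fallback, assuming an unset input evaluates to an empty or null value (`resolve_run_id` is a hypothetical name):

```python
from typing import Optional


def resolve_run_id(input_value: Optional[str], env_value: str) -> str:
    """Mimic `inputs.x || env.x`: use the input unless it is empty or missing."""
    # `or` returns the first truthy operand, so None and "" both fall
    # through to the env default, like `||` in a workflow expression.
    return input_value or env_value


# On a `push` run no input is supplied, so the env default is written out.
on_push = resolve_run_id(None, "")
# An explicit input wins over the env default.
on_dispatch = resolve_run_id("123", "456")
```

Editing the two `env` values on a `run_ci_with_nightly_torch*` branch is therefore enough to change which runs a push-triggered comparison targets.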


@@ -36,7 +36,7 @@ jobs:
   send_results:
     name: Send results to webhook
     runs-on: ubuntu-22.04
-    if: always()
+    if: always() && !cancelled()
     steps:
       - name: Preliminary job status
         shell: bash
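
The effect of changing the job condition from `always()` to `always() && !cancelled()` can be sketched as a small truth table. The functions below are hypothetical Python stand-ins for the GitHub Actions status checks, under the usual reading that `always()` is true for every outcome and `cancelled()` is true only when the run was cancelled.

```python
def old_condition(status: str) -> bool:
    """`if: always()` runs the job whatever happened upstream."""
    return True


def new_condition(status: str) -> bool:
    """`if: always() && !cancelled()` skips the job for cancelled runs."""
    cancelled = status == "cancelled"
    return True and not cancelled


# Which outcomes still trigger the `send_results` job under each condition.
outcomes = ["success", "failure", "cancelled"]
old_runs = [s for s in outcomes if old_condition(s)]
new_runs = [s for s in outcomes if new_condition(s)]
```

The only behavioural difference is the cancelled case: results are still sent after a success or a failure, but a cancelled run no longer produces a notification.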

BIN
benchmark/._README.md Normal file

Binary file not shown.

BIN
benchmark/.___init__.py Normal file

Binary file not shown.

BIN
benchmark/._benchmark.py Normal file

Binary file not shown.

BIN
benchmark/._config Normal file

Binary file not shown.

BIN
benchmark/._default.yml Normal file

Binary file not shown.

BIN
docker/._README.md Normal file

Binary file not shown.

BIN
docker/._transformers-gpu Normal file

Binary file not shown.


@@ -32,7 +32,10 @@ RUN python3 -m pip uninstall -y flax jax
 RUN python3 -m pip install --no-cache-dir -U timm
-RUN python3 -m pip install --no-cache-dir git+https://github.com/facebookresearch/detectron2.git pytesseract
+RUN [ "$PYTORCH" != "pre" ] && python3 -m pip install --no-cache-dir git+https://github.com/facebookresearch/detectron2.git || echo "Don't install detectron2 with nightly torch"
+RUN python3 -m pip install --no-cache-dir pytesseract
 RUN python3 -m pip install -U "itsdangerous<2.1.0"
 RUN python3 -m pip install --no-cache-dir git+https://github.com/huggingface/accelerate@main#egg=accelerate
@@ -52,7 +55,7 @@ RUN python3 -m pip install --no-cache-dir bitsandbytes
 RUN python3 -m pip install --no-cache-dir quanto
 # After using A10 as CI runner, let's run FA2 tests
-RUN python3 -m pip uninstall -y ninja && python3 -m pip install --no-cache-dir ninja && python3 -m pip install flash-attn --no-cache-dir --no-build-isolation
+RUN [ "$PYTORCH" != "pre" ] && python3 -m pip uninstall -y ninja && python3 -m pip install --no-cache-dir ninja && python3 -m pip install flash-attn --no-cache-dir --no-build-isolation || echo "Don't install FA2 with nightly torch"
 # TODO (ydshieh): check this again
 # `quanto` will install `ninja` which leads to many `CUDA error: an illegal memory access ...` in some model tests

Some files were not shown because too many files have changed in this diff.