Commit Graph

19883 Commits

Author SHA1 Message Date
Yih-Dar
0d511f7a77 Use comment to build doc on PRs (#39846)
* try

* try

* try

* try

* try

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-08-04 10:24:45 +02:00
Quentin Gallouédec
4819adbbaa Refactor label name handling for PEFT models in Trainer class (#39265)
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-08-04 06:29:57 +00:00
Quentin Gallouédec
166fcad3f8 Improve is_wandb_available function to verify WandB installation (#39875)
Improve `is_wandb_available` function to verify WandB installation by checking for a key attribute
2025-08-04 08:22:52 +02:00
Arthur
6dfd561d9c remove dtensors, not explicit (#39840)
* remove dtensors, not explicit

Co-authored-by: 3outeille <3outeille@users.noreply.github.com>

* style

* fix test

* update

* as we broke saving try to fix

* output layouts should exit

* nit

* devicemesh exists if it was distributed

* use _device_mesh of self

* update

* lol

* fix

* nit

* update

* fix!

* this???

* grumble grumble

* ?

* fuck me

---------

Co-authored-by: 3outeille <3outeille@users.noreply.github.com>
2025-08-01 22:02:47 +02:00
Quentin Gallouédec
b727c2b20e Allow TrackioCallback to work when pynvml is not installed (#39851)
Allow TrackioCallback to work when pynvml is not installed
2025-08-01 18:57:25 +02:00
StevenBucaille
1ec0feccdd [image-processing] deprecate plot_keypoint_matching, make visualize_keypoint_matching as a standard (#39830)
* fix: deprecate plot_keypoint_matching and make visualize_keypoint_matching for all Keypoint Matching models

* refactor: added copied from

* fix: make style

* fix: repo consistency

* fix: make style

* docs: added missing method in SuperGlue docs
2025-08-01 16:29:57 +00:00
Yoni Gozlan
7b4d9843ba Add fast image processor Janus, Deepseek VL, Deepseek VL hybrid (#39739)
* add fast image processor Janus, deepseek_vl, deepseek_vl_hybrid

* fix after review
2025-08-01 12:20:08 -04:00
Lysandre Debut
88ead3f518 Fix responses add tests (#39848)
* Quick responses fix

* [serve] Fix responses API and add tests

* Remove typo

* Remove typo

* Tests
2025-08-01 18:06:08 +02:00
Arthur
6ea646a03a Update ux cb (#39845)
* clenaup

* nits

* updates

* fix logging

* push updates?

* just passexception

* update

* nits

* fix

* add tokencount

* style
2025-08-01 16:50:28 +02:00
rziga
3951d4ad5d Add MM Grounding DINO (#37925)
* first commit

Added modular implementation for MM Grounding DINO from starting point created by add-new-model-like. Added conversion script from mmdetection to huggingface.

TODO: Some tests are failing so that needs to be fixed.

* fixed a bug with modular definition of MMGroundingDinoForObjectDetection where box and class heads were not correctly assigned to inner model

* cleaned up a hack in the conversion script

* Fixed the expected values in integration tests

Cross att masking and cpu-gpu consistency tests are still failing however.

* changes for make style and quality

* add documentation

* clean up contrastive embedding

* add mm grounding dino to loss mapping

* add model link to config docstring

* hack fix for mm grounding dino consistency tests

* add special cases for unused config attr check

* add all models and update docs

* update model doc to the new style

* Use super_kwargs for modular config

* Move init to the _init_weights function

* Add copied from for tests

* fixup

* update typehints

* Fix-copies for tests

* fix-copies

* Fix init test

* fix snippets in docs

* fix consistency

* fix consistency

* update conversion script

* fix nits in readme and remove old comments from conversion script

* add license

* remove unused config args

* remove unnecessary if/else in model init

* fix quality

* Update references

* fix test

* fixup

---------

Co-authored-by: qubvel <qubvel@gmail.com>
2025-08-01 15:43:23 +01:00
Yuanyuan Chen
50145474b7 [typecheck] proper export of private symbols (#39729)
* Export private symbols

Signed-off-by: cyy <cyyever@outlook.com>

* Update src/transformers/__init__.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Update src/transformers/__init__.py

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>

* Fix format

Signed-off-by: cyy <cyyever@outlook.com>

* Add a comment for exported symbols

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-08-01 13:36:47 +01:00
Arthur
c962f1515e [attn_implementation] remove recursive, allows custom kernels with wrappers (#39823)
* fix?

* fixme and style

* Update src/transformers/modeling_utils.py

* update

* update

* fix

* small fixees

* nit

* nits

* fix init check?

* fix

* fix default

* or fucks me

* nits

* include a small nit

* does this make it hapy?

* fixup

* fix the remaining ones
2025-08-01 12:18:28 +02:00
Raushan Turganbay
d3b8627b56 [VLMs] split out "get placeholder mask" to helper (#39777)
* batch upidate all models

* update

* forgot about llava onevision

* update

* fix tests

* delete file

* typo

* fix emu3 once and forever

* update cohere2 vision as well
2025-08-01 08:01:06 +00:00
Arthur
a115b67392 Fix tp cb (#39838)
* fixes

* one more
2025-08-01 09:59:04 +02:00
Eric Bezzam
2c0af41ce5 Fix bad markdown links (#39819)
Fix bad markdown links.
2025-07-31 09:14:14 -07:00
Tommy Chiang
4fcf455517 Fix broken links (#39809)
Replace links in the form of `[text]((url))` to `[text](url)`. This is
the correct format of a url in the markdown.
2025-07-31 13:23:04 +00:00
Raushan Turganbay
b937d47455 [cohere2 vision] move doc to multimodal section (#39820)
move doc to multimodal section
2025-07-31 15:13:02 +02:00
Kyle Duffy
6ba8a1ff45 Update documentation for Cohere2Vision models (#39817)
* Update docs with pipeline example

* Add Cohere2Vision to list of vision models

* Sort models
2025-07-31 11:58:45 +00:00
Raushan Turganbay
e1688d28d3 [Model] Cohere2 Vision (#39810)
* Add cohere2_vision to support CohereLabs/command-a-vision-07-2025

* update and add modualr file

* update processors and check with orig impl later

* delete unused files

* image processor reduce LOC and re-use GotOCR2

* update the config to use modular

* model tests pass

* processor fixes

* check model outputs decorator

* address one more comment

* Update tokens. Temp - need to read from tokenizer'

* fix for multi-gpu

* Fix image token handling

* upadte image token expansion logic

* fix a few issues with remote code loading

* not related but modular forces us to change all files now

* Add overview and code sample to cohere vision docs

* add scripts. TMP.

* Update inference script

* Create script

* set dtype in export script

* TO revert: modular export fix

* Fix scripts

* Revert "TO revert: modular export fix"

This reverts commit bdb2f305b61027a05f0032ce70d6ca698879191c.

* Use modular weights

* Upload to hub

Removed OOD weights ad script

* Updated docs

* fix import error

Update docs

Added pipeline test

* Updated docs

* Run modular script

remove modular for config

Added patch_size

Added docstrings in modular

Fix OOM

Add docs, fixup integration tests. 8-gpu passing

* tiny updates

* address comments + fixup

* add test for chat template

* check model outputs workaround

* aya vision fix check model inputs

* Revert "add test for chat template"

This reverts commit 42c756e397f588d76b449ff1f93292d8ee0202d8.

* reveert more changes

* last revert

* skip and merge

* faulty copy from

---------

Co-authored-by: Julian Mack <julian.mack@cohere.com>
Co-authored-by: kyle-cohere <kyle@cohere.com>
2025-07-31 10:57:34 +00:00
Joao Gante
6c3f27ba61 [docs] fix korean docs yet again (#39813)
fix korean docs yet again
2025-07-31 09:13:25 +00:00
Jeff Zhang
cb289ad243 feat(tokenization): add encode_message to tokenize messages one by one (#39507)
* feat(tokenization): add encode_message to tokenize messages one by one

* Fix the `encode_message` method, remove the `add_generation_prompt` parameter and add the corresponding error handling. Update the document to reflect this change and verify the error handling in the test.

* Optimize the `encode_message` method, improve the processing logic of the empty dialogue history, and ensure that the chat template can be applied correctly when the dialogue history is empty. Update the document to reflect these changes.

* The `_encode_message` method is deleted, the message coding logic is simplified, and the functional integrity of the `encode_message` method is ensured. Update the document to reflect these changes.

* Docs fix

* Revert changes in docstring of pad()

* Revert changes in docstring

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Repair the call of the `encode_message` method, update it to `encode_message_with_chat_template` to support the chat template, and adjust the relevant test cases to reflect this change.

* Optimize the call format of the `apply_chat_template` method, and merge multi-line calls into a single line to improve code readability.

---------

Co-authored-by: pco111 <15262555+pco111@user.noreply.gitee.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-07-31 10:55:45 +02:00
Joao Gante
4f93cc9174 fix: providing a tensor to cache_position in model.generate kwargs always crashes because of boolean test (#39300)
* fix: cache_position: RuntimeError: Boolean value of Tensor with more than one value is ambiguous

* test cache_position

* move test

* propagate changes

---------

Co-authored-by: Masataro Asai <guicho2.71828@gmail.com>
2025-07-30 17:30:28 +00:00
Bernhard Liebl
9b3203f47b Add callback to monitor progress in whisper transcription (#37483)
* Add callback to monitor progress in whisper transcription

* Added `` around variables, rewording

* Add example of `monitor_progress`.

---------

Co-authored-by: Eric B <ebezzam@gmail.com>
2025-07-30 17:40:53 +02:00
Drew Ross
7abb5d3992 Update mT5 model card (#39702)
* Update mt5 model card

* Fix casing of model title

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-30 08:35:04 -07:00
Arpon Kapuria
1019b00028 Update model card for Cohere2 (Command R7B) (#39604)
* Update model card for Cohere2 (Command R7B)

* fix: applied suggested changes
2025-07-30 08:34:26 -07:00
Ethan Villarosa
ecbb5ee194 standardized BARThez model card (#39701)
* standardized barthez model card according to template

* Update docs/source/en/model_doc/barthez.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/barthez.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/barthez.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/barthez.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/barthez.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/barthez.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* suggested changes to barthez model card

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-30 08:33:13 -07:00
Raushan Turganbay
8e077a3e45 Fix re-compilations for cross attention cache (#39788)
fix recompilations for cross attn cache
2025-07-30 14:52:03 +02:00
Yuanyuan Chen
1e0665a191 Simplify conditional code (#39781)
* Use !=

Signed-off-by: cyy <cyyever@outlook.com>

* Use get

Signed-off-by: cyy <cyyever@outlook.com>

* Format

* Simplify bool operations

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-07-30 12:32:10 +00:00
Yuanyuan Chen
b94929eb49 Fix an invalid condition (#39762)
Fix an invalid judgement

Signed-off-by: cyy <cyyever@outlook.com>
2025-07-30 12:19:17 +00:00
Yao Matrix
bb2ac66453 fix chameleonvision UT failure (#39646)
* fix chameleonvision UT failure

Signed-off-by: matrix.yao@intel.com <Yao Matrix>

* fix style

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

---------

Signed-off-by: matrix.yao@intel.com <Yao Matrix>
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Co-authored-by: root <Yao Matrix>
2025-07-30 12:09:26 +00:00
Raushan Turganbay
5348445dfa Super tiny update (#39727)
super tiny update
2025-07-30 12:21:41 +02:00
Yih-Dar
54cbea5615 more info in model_results.json (#39783)
more info

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-30 11:43:10 +02:00
eustlb
01d5f94695 [ASR pipline] fix with datasets 4.0 (#39504)
* fix

* handle edge case

* make
2025-07-30 08:13:40 +00:00
jiqing-feng
8ab21be570 enable static cache on vision encoder decoder (#39773)
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-07-30 08:10:46 +00:00
Cyril Vallez
67cfe11528 Fix Evolla and xLSTM tests (#39769)
* fix all evolla

* xlstm
2025-07-30 09:51:55 +02:00
Quentin Gallouédec
ec4033457e Don't set run_name when none (#39695)
* Don't set run_name when none

* revert

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-07-30 01:39:29 +00:00
Yana Mishula
551a89a4a3 Standardize CLAP model card format (#39738)
* Standardize CLAP model card format

* Apply review feedback

* Remove Resources section
2025-07-29 14:13:04 -07:00
StevenBucaille
da70b1389a docs: Update EfficientLoFTR documentation (#39620)
* docs: Update EfficientLoFTR documentation

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-29 13:54:44 -07:00
Cyril Vallez
ddd2100767 Fix OmDet test after arg deprecation (#39766)
fix arg name
2025-07-29 22:10:36 +02:00
st81
4abb053b6c Remove python3.7 reference from doc link (#39706) 2025-07-29 09:17:13 -07:00
Joao Gante
33aa49df9d [docs] Ko doc fixes after toc update (#39660)
* update docs

* doc builder working

* make fixup
2025-07-29 17:05:26 +01:00
Manuel de Prada Corral
c4e2069898 Fix Cache.max_cache_len max value for Hybrid models (#39737)
* fix gemma

* fix min

* fix quant init issue

* fix gemma 3n

* skip quant cache test

* fix modular

* new test for Gemma

* include cyril change

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-07-29 17:12:50 +02:00
Taihang Hu
075dbbceaa fix(trainer): Correct loss scaling for incomplete gradient accumulation steps (#39659)
* Fix issue[#38837]: wrong loss scaled in last step of epoch

* chore: trigger CI

* Update src/transformers/trainer.py

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>

* Update src/transformers/modeling_flash_attention_utils.py

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>

---------

Co-authored-by: taihang <taihang@U-2RHYVWX7-2207.local>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
2025-07-29 17:12:31 +02:00
Jaehyeon Shin
1d061536cf 🌐 [i18n-KO] Translated how_to_hack_models.md to Korean (#39536)
* docs: ko: how_to_hack_models.md

* feat: nmt draft

* fix: manual edits
2025-07-29 08:09:16 -07:00
박종범
43fe41c0a8 🌐 [i18n-KO] Translated perf_train_gpu_one.md to Korean (#39552)
* docs: ko: perf_train_gpu_one.md

* feat: nmt draft

* fix: manual edits

* fix: Manually added missing backticks

* Update docs/source/ko/perf_train_gpu_one.md

fix: remove space between heading and GPU anchor

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update docs/source/ko/perf_train_gpu_one.md

fix: clarify table headers to indicate training speed boost and memory savings

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>

* Update docs/source/ko/perf_train_gpu_one.md

fix: improve readability

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

* Update docs/source/ko/perf_train_gpu_one.md

fix : rephrase explanation of data preloading to improve readability

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

---------

Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
2025-07-29 08:08:57 -07:00
Ahn Joon Sung
9f38763731 🌐 [i18n-KO] Translated pipeline_gradio.md to Korean (#39520)
* docs: ko: pipeline_gradio.md

* feat: nmt draft

* fix: manual edits

* docs: ko: pipeline_gradio.md
2025-07-29 08:04:30 -07:00
Lio (임승섭)
f72311796b 🌐 [i18n-KO] Translated tokenizer.md to Korean (#39532)
* docs: ko: tokenizer.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Yijun Lee <yijun-lee@users.noreply.github.com>

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>

* fix: resolve suggestions

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>

---------

Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>
2025-07-29 08:04:14 -07:00
Kim Juwon
d346d46752 🌐 [i18n-KO] Translated tvp.md to Korean (#39578)
* docs: ko: tvp.md

* feat: nmt draft

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com>

---------

Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com>
2025-07-29 08:04:00 -07:00
Ahnjj_DEV
2f59c15b33 🌐 [i18n-KO] Translated albert.md to Korean (#39524)
* docs: ko: albert.md

* feat: nmt draft

* fix: manual edits
2025-07-29 08:03:40 -07:00
Minseo Kim
98386dcee9 🌐 [i18n-KO] Translated main_classes/peft.md (#39515)
* docs: ko: main_classes/peft.md

* feat: nmt draft

* docs: add missing TOC to documentation for `PeftAdapterMixin` section

Added a table of contents (TOC) to the documentation, specifically for the `transformers.integrations.PeftAdapterMixin` section, following the structure and content outlined in [this link](https://huggingface.co/docs/transformers/main/en/main_classes/peft#transformers.integrations.PeftAdapterMixin).

* fix: Improve naturalness of purpose expression in Korean

Changed '관리하기 위한' to '관리할 수 있도록' for more natural Korean expression when describing the purpose of providing functions.

* fix: Simplify plural form and make expression more concise

Changed '~할 수 없기 때문에' to '~할 수 없어' for more concise expression while maintaining clarity.

* fix: Replace technical term '주입' with more natural '적용'

Changed '주입할 수 없어' to '적용할 수 없어' for better readability.
Considered alternatives:

'삽입': Too literal translation of 'inject'
'입력': Could be misunderstood as data input
'통합': Implies merging two systems
'추가': Simple but less precise

'적용' was chosen as it's the most natural and widely used term in Korean technical documentation for this context.

* fix: update toctree path for PEFT to lowercase

Changed the toctree path from 'PEFT' (uppercase) to 'peft' (lowercase) to match the correct directory naming convention and prevent broken links.

* docs: update as per reviewer feedback after rebase
2025-07-29 08:03:17 -07:00