Yih-Dar
e55983e2b9
Fix aya_vision test ( #38674 )
...
* fix 1: load_in_4bit=True,
* fix 2: decorator
* fixfix 2: breakpoint
* fixfix 3: update
* fixfix 4: fast
* fixfix 5: cond
* fixfix 5: cond
* fixfix 6: cuda 8
* ruff
* breakpoint
* dtype
* a10
* a10
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-06-09 22:18:52 +02:00
Aashish Anand
b61c47f5a5
Created model card for xlm-roberta-xl ( #38597 )
...
* Created model card for xlm-roberta-xl
* Update XLM-RoBERTa-XL model card with improved descriptions and usage examples
* Minor option labeling fix
* Added MaskedLM version of XLM RoBERTa XL to model card
* Added quantization example for XLM RoBERTa XL model card
* minor fixes to xlm roberta xl model card
* Minor fixes to mask format in xlm roberta xl model card
2025-06-09 13:00:38 -07:00
Aashish Anand
e594e75f1b
Update XLM-RoBERTa model documentation with enhanced usage examples and improved layout ( #38596 )
...
* Update XLM-RoBERTa model documentation with enhanced usage examples and improved layout
* Added CLI command example and quantization example for XLM RoBERTa model card.
* Minor change to transformers CLI and quantization example for XLM roberta model card
2025-06-09 12:26:31 -07:00
Aashish Anand
29ca043856
Created model card for XLM model ( #38595 )
...
* Created model card for XLM model
* Revised model card structure and content of XLM model
* Update XLM model documentation with improved examples and code snippets for predicting <mask> tokens using Pipeline and AutoModel.
2025-06-09 12:26:23 -07:00
Marcel Ambo Ndowah
25f711aa89
Drop as_target_processor from the _call_ and pad methods ( #38642 )
...
Drop as_target_processor from _call_ and pad methods; reformat docstrings for readability
2025-06-09 12:26:09 -07:00
Matthew Douglas
837ddac1ec
Docs: update bitsandbytes torch.compile compatibility ( #38651 )
2025-06-09 14:51:57 -04:00
dbleyl
b9faf2f930
Fix TypeError: 'NoneType' object is not iterable for esm ( #38667 ) ( #38668 )
...
Add post_init() calls to EsmForMaskedLM, EsmForTokenClassification and EsmForSequenceClassification.
2025-06-09 15:23:20 +00:00
Fiona Waters
11dca07a10
Fix retrieve function signature and remove faiss requirement ( #38624 )
...
Signed-off-by: Fiona Waters <fiwaters6@gmail.com >
2025-06-09 15:17:33 +00:00
xiao
b31d462c61
Fix some models import ( #38694 )
...
Fix models import
2025-06-09 16:09:24 +01:00
pweglik
282d6684dc
Fix attention mask expansion when converting to executorch ( #38637 )
2025-06-09 15:00:55 +00:00
Anthony
19224c3642
fix: "check out" as verb ( #38678 )
...
"check out" as verb
2025-06-09 14:07:31 +00:00
StevenBucaille
237ff80387
Fixed modeling_auto.py MODEL_FOR_MASK_GENERATION_MAPPING_NAMES variable ( #38664 )
...
fix: grouped the two MODEL_FOR_MASK_GENERATION_MAPPING_NAMES variables
2025-06-09 13:40:46 +00:00
Isotr0py
d7b87b415a
Fix qwen2-audio chat template audio placeholder insertion ( #38640 )
...
* fix qwen2-audio template
Signed-off-by: Isotr0py <2037008807@qq.com >
* add message['type'] back
Signed-off-by: Isotr0py <2037008807@qq.com >
---------
Signed-off-by: Isotr0py <2037008807@qq.com >
2025-06-09 09:56:42 +00:00
Yih-Dar
10627c1a0f
Use torch 2.7.1 on daily CI ( #38620 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-06-08 14:37:45 +02:00
Yih-Dar
ebeec13609
Fix InternVL integration test ( #38612 )
...
* fix
* fix
* fix OOM
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-06-07 08:30:47 +02:00
Yih-Dar
3fb7e7bc01
Skip torchscript tests for 2 models ( #38643 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-06-06 20:17:37 +02:00
Yao Matrix
dc76eff12b
remove ipex_optimize_model usage ( #38632 )
...
* remove ipex_optimize_model usage
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* update Dockerfile
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com >
---------
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com >
Co-authored-by: root <root@a4bf01945cfe.jf.intel.com >
2025-06-06 20:04:44 +02:00
Yih-Dar
5009252a05
Better CI ( #38552 )
...
better CI
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-06-06 17:59:14 +02:00
jiqing-feng
2e889c18e1
fix torch_dtype on awq ( #38463 )
...
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
2025-06-06 17:14:00 +02:00
inkcherry
871901cb3d
fix total batch size calculation in trainer ( #38286 )
...
* fix total batch size calculation
* update
Signed-off-by: inkcherry <mingzhi.liu@intel.com >
* Update src/transformers/trainer.py
---------
Signed-off-by: inkcherry <mingzhi.liu@intel.com >
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
2025-06-06 14:54:00 +00:00
Yih-Dar
02f946a038
Don't run AriaForConditionalGenerationModelTest on CircleCI ( #38615 )
...
get rid of this model
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-06-06 11:30:31 +02:00
Mehant Kammakomati
3d15606e64
fix: support grad clipping for TP through replicating non-sharded modules ( #36132 )
...
* feat: fix tp grad norm:
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com >
* feat: use implicit replication
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com >
---------
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com >
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
2025-06-06 11:07:22 +02:00
Yih-Dar
fca6748246
Improve test_initialization for SwiftFormer ( #38636 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-06-06 10:47:10 +02:00
Yih-Dar
92a87134ea
update ColQwen2ModelIntegrationTest ( #38583 )
...
* update
* update
* update
* update
* 4 bit
* 8 bit
* final
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-06-06 10:41:17 +02:00
Raushan Turganbay
dbfc79c17c
[generation] bring back tests on vision models ( #38603 )
...
* bring back generation tests on VLMs
* remove head mask tests overwritten
2025-06-06 08:23:15 +00:00
Yih-Dar
90c4b90a10
Use torch 2.7.1 on CircleCI jobs ( #37856 )
...
2.7.1
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-06-06 10:16:57 +02:00
Yih-Dar
3e35ea1782
Improve test_initialization ( #38607 )
...
* fix flaky init tests
* fix flaky init tests
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-06-06 10:08:05 +02:00
Yao Matrix
89542fb81c
enable more test cases on xpu ( #38572 )
...
* enable glm4 integration cases on XPU, set xpu expectation for blip2
Signed-off-by: Matrix YAO <matrix.yao@intel.com >
* more
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* fix style
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* refine wording
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* refine test case names
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* run
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* add gemma2 and chameleon
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* fix review comments
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
---------
Signed-off-by: Matrix YAO <matrix.yao@intel.com >
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
2025-06-06 09:29:51 +02:00
Armaghan Shakir
31023b6909
Fix MiniMax (docs and integration tests checkpoint) ( #38575 )
...
* update checkpoints for integration tests
* minor fixes in docs
2025-06-06 08:43:11 +02:00
Vanshu
593e29c5e2
Updated Aria model card ( #38472 )
...
* Update aria.md
* Update aria.md
* Suggested Updates - aria.md
2025-06-05 14:36:54 -07:00
Parag Ekbote
77cf4936fe
[Nit] Add Note on SigOpt being in Public Archive Mode ( #38610 )
...
* add note on sigopt
* update
* Update docs/source/en/hpo_train.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-06-05 14:07:23 -07:00
Monish Singhal
c75bf2c36e
Fix typo in LLaVa documentation ( #38618 )
...
* Fix typo in LLaVa documentation
In exactly one section, LlavaImageProcessor was spelt wrongly as LLavaImageProcessor, which throws off copy-pasting the section.
* Fix LlavaImageProcessor url to make it valid (and copypaste-able)
Earlier, the URL contained the entire HF prefix. This commit removes that to ensure that the code block can be copied and run as is.
2025-06-05 13:25:07 -07:00
johncaged
5399c1d670
docs: fix dark mode logo display. ( #38586 )
2025-06-05 13:06:59 -07:00
Yih-Dar
481b953170
Fix return_dict=False giving errors in a few VLM models ( #38519 )
...
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-06-05 21:19:07 +02:00
Sai-Suraj-27
88912b8e95
Remove isort from dependencies ( #38616 )
...
Removed isort as a dependency
2025-06-05 16:42:49 +00:00
David Klank
fa921ad854
fix spelling errors ( #38608 )
...
* fix errors test_modeling_mllama.py
* fix error test_modeling_video_llava.py
* fix errors test_processing_common.py
2025-06-05 13:57:23 +01:00
Isotr0py
0f833528c9
Avoid overwriting existing local implementation when loading remote custom model ( #38474 )
...
* avoid overwriting existing local implementation when loading custom remote model
Signed-off-by: Isotr0py <2037008807@qq.com >
* update comments
Signed-off-by: Isotr0py <2037008807@qq.com >
---------
Signed-off-by: Isotr0py <2037008807@qq.com >
2025-06-05 13:54:40 +01:00
KameniAlexNea
8f630651b0
Allow mlm_probability to be set to None when mlm=False in DataCollatorForLanguageModeling ( #38522 ) ( #38537 )
...
* mlm_probability in DataCollatorForLanguageModeling should be validated only when mlm is True (#38522 )
* Change mlm_probability to Optional in DataCollatorForLanguageModeling (#38537 )
---------
Co-authored-by: eak <eak@ivalua.com >
2025-06-05 13:54:12 +01:00
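The validation rule described in the commit above can be sketched as follows. This is a minimal illustration only: `CollatorConfig` is a hypothetical stand-in class, not the actual `DataCollatorForLanguageModeling` code, and the exact bounds check and error message are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CollatorConfig:
    """Hypothetical stand-in; not the real DataCollatorForLanguageModeling."""

    mlm: bool = True
    # Optional so that callers can pass None when masking is disabled.
    mlm_probability: Optional[float] = 0.15

    def __post_init__(self):
        # Validate the masking probability only when masking is enabled.
        if self.mlm:
            if self.mlm_probability is None or not 0 < self.mlm_probability < 1:
                raise ValueError("mlm_probability must be in (0, 1) when mlm=True")


# With mlm=False, a None probability is accepted without validation.
causal_config = CollatorConfig(mlm=False, mlm_probability=None)
```

With `mlm=True`, passing `mlm_probability=None` raises `ValueError` in this sketch.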
dependabot[bot]
65f5fa71cd
Bump torch from 2.6.0 to 2.7.1 in /examples/flax/vision ( #38606 )
...
Bumps [torch](https://github.com/pytorch/pytorch ) from 2.6.0 to 2.7.1.
- [Release notes](https://github.com/pytorch/pytorch/releases )
- [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md )
- [Commits](https://github.com/pytorch/pytorch/compare/v2.6.0...v2.7.1 )
---
updated-dependencies:
- dependency-name: torch
  dependency-version: 2.7.1
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-05 13:38:02 +01:00
Yih-Dar
8c59cdb3f8
pin pandas ( #38605 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-06-05 11:33:06 +02:00
Yih-Dar
8cfcfe58c0
Remove custom pytest and pluggy ( #38589 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-06-05 10:23:40 +02:00
Raushan Turganbay
0d69fa6dcd
[qwen-omni] fix sliding window ( #38525 )
...
fix
2025-06-05 10:11:58 +02:00
Henrik Matthiesen
1fed6166c0
added fast image processor for ZoeDepth and expanded tests accordingly ( #38515 )
...
* added fast image processor for ZoeDepth and expanded tests accordingly
* added fast image processor for ZoeDepth and expanded tests accordingly, hopefully fixed repo consistency issue too now
* final edits for ZoeDepth fast image processor
* final minor edit for ZoeDepth fast image processor
2025-06-04 22:59:17 +00:00
Sai-Suraj-27
a510be20f3
Updated deprecated typing imports with equivalents for Python 3.9+ ( #38546 )
...
* Replace deprecated typing imports with collections.abc equivalents for Python 3.9+
* Fixed code quality
---------
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com >
2025-06-04 16:57:23 +00:00
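As a rough illustration of the kind of change the commit above describes, the sketch below annotates a function with `collections.abc` types and built-in generics instead of the deprecated `typing` aliases. The function `count_tokens` is a made-up example and does not come from the repository.

```python
from collections.abc import Iterable, Mapping

# Before (aliases deprecated since Python 3.9):
#   from typing import Dict, Iterable, List, Mapping


def count_tokens(batches: Mapping[str, Iterable[list[str]]]) -> dict[str, int]:
    """Count tokens per key, annotated with collections.abc and built-in generics."""
    return {name: sum(len(tokens) for tokens in batch) for name, batch in batches.items()}
```

The runtime behavior is unchanged by such a migration; only the import source and annotation spelling differ.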
RogerSinghChugh
8e1266de2b
New gpt neo model card ( #38505 )
...
* Updated BERTweet model card.
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* updated toctree (EN).
* Commit for new_gpt_model_card.
* Update docs/source/en/model_doc/gpt_neo.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-06-04 09:56:47 -07:00
Dmitry Rogozhkin
8046aff520
tests/roformer: fix couple roformer tests on gpus ( #38570 )
...
Fix "RuntimeError: Expected all tensors to be on the same device,
but found at least two devices, cuda:0 and cpu" error running the
following roformer tests on GPUs (CUDA or XPU):
```
tests/models/roformer/test_modeling_roformer.py::RoFormerSinusoidalPositionalEmbeddingTest::test_basic
tests/models/roformer/test_modeling_roformer.py::RoFormerSelfAttentionRotaryPositionEmbeddingTest::test_apply_rotary_position_embeddings
```
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com >
2025-06-04 18:45:56 +02:00
Aryan Chauhan
b9c17c5dc0
[Dinov2] Enable device_map="auto" support ( #38487 )
...
* Fix: resolve import order and duplicate import (ruff I001, F811)
* Format: clean up Dinov2 test file with ruff formatter
* Add _no_split_modules = ['Dinov2Layer'] to enable device_map='auto'
* Revert dinov2_with_registers _no_split_modules to original state
* Remove redundant device_map test as suggested
* Remove unused import after deleting test
* removed import torch and the redundant test function
* Update tests/models/dinov2/test_modeling_dinov2.py
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
2025-06-04 15:42:40 +00:00
Luc Georges
ae3733f06e
feat: add repository field to benchmarks table ( #38582 )
...
* feat: add `repository` field to benchmarks table
* fix: remove unwanted `,`
2025-06-04 15:40:52 +02:00
Manal ML
1285aec4cc
Docs: fix code formatting in torchao docs ( #38504 )
2025-06-04 12:35:21 +00:00
Minho Ryu
6c5d4b1dd2
allow custom head_dim for qwen2_moe ( #37188 )
...
allow custom head_dim
Co-authored-by: ryan.agile <ryan.agile@kakaobrain.com >
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
2025-06-04 12:27:30 +00:00