HuggingFace_transformer

Author	SHA1	Message	Date
Yih-Dar	e55983e2b9	Fix `aya_vision` test (#38674 ) * fix 1: load_in_4bit=True, * fix 2: decorateor * fixfix 2: breakpoint * fixfix 3: update * fixfix 4: fast * fixfix 5: cond * fixfix 5: cond * fixfix 6: cuda 8 * ruff * breakpoint * dtype * a10 * a10 --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-09 22:18:52 +02:00
Aashish Anand	b61c47f5a5	Created model card for xlm-roberta-xl (#38597 ) * Created model card for xlm-roberta-xl * Update XLM-RoBERTa-XL model card with improved descriptions and usage examples * Minor option labeling fix * Added MaskedLM version of XLM RoBERTa XL to model card * Added quantization example for XLM RoBERTa XL model card * minor fixes to xlm roberta xl model card * Minor fixes to mask format in xlm roberta xl model card	2025-06-09 13:00:38 -07:00
Aashish Anand	e594e75f1b	Update XLM-RoBERTa model documentation with enhanced usage examples and improved layout (#38596 ) * Update XLM-RoBERTa model documentation with enhanced usage examples and improved layout * Added CLI command example and quantization example for XLM RoBERTa model card. * Minor change to transformers CLI and quantization example for XLM roberta model card	2025-06-09 12:26:31 -07:00
Aashish Anand	29ca043856	Created model card for XLM model (#38595 ) * Created model card for XLM model * Revised model card structure and content of XLM model * Update XLM model documentation with improved examples and code snippets for predicting <mask> tokens using Pipeline and AutoModel.	2025-06-09 12:26:23 -07:00
Marcel Ambo Ndowah	25f711aa89	Drop as_target_processor from the _call_ and pad methods (#38642 ) Drop as_target_processor from _call_ and pad methods; reformat docstrings for readability	2025-06-09 12:26:09 -07:00
Matthew Douglas	837ddac1ec	Docs: update bitsandbytes torch.compile compatibility (#38651 )	2025-06-09 14:51:57 -04:00
dbleyl	b9faf2f930	Fix TypeError: 'NoneType' object is not iterable for esm (#38667 ) (#38668 ) Add post_init() calls to EsmForMaskedLM, EsmForTokenClassification and EsmForSequenceClassification.	2025-06-09 15:23:20 +00:00
Fiona Waters	11dca07a10	Fix retrieve function signature and remove faiss requirement (#38624 ) Signed-off-by: Fiona Waters <fiwaters6@gmail.com>	2025-06-09 15:17:33 +00:00
xiao	b31d462c61	Fix some models import (#38694 ) Fix models import	2025-06-09 16:09:24 +01:00
pweglik	282d6684dc	Fix attention mask expansion when converting to executorch (#38637 )	2025-06-09 15:00:55 +00:00
Anthony	19224c3642	fix: "check out" as verb (#38678 ) "check out" as verb	2025-06-09 14:07:31 +00:00
StevenBucaille	237ff80387	Fixed modeling_auto.py MODEL_FOR_MASK_GENERATION_MAPPING_NAMES variable (#38664 ) fix: grouped the two MODEL_FOR_MASK_GENERATION_MAPPING_NAMES variables	2025-06-09 13:40:46 +00:00
Isotr0py	d7b87b415a	Fix qwen2-audio chat template audio placeholder insertion (#38640 ) * fix qwen2-audio template Signed-off-by: Isotr0py <2037008807@qq.com> * add message['type'] back Signed-off-by: Isotr0py <2037008807@qq.com> --------- Signed-off-by: Isotr0py <2037008807@qq.com>	2025-06-09 09:56:42 +00:00
Yih-Dar	10627c1a0f	Use torch 2.7.1 on daily CI (#38620 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-08 14:37:45 +02:00
Yih-Dar	ebeec13609	Fix `InternVL` integration test (#38612 ) * fix * fix * fix OOM --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-07 08:30:47 +02:00
Yih-Dar	3fb7e7bc01	Skip torchscript tests for 2 models (#38643 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-06 20:17:37 +02:00
Yao Matrix	dc76eff12b	remove ipex_optimize_model usage (#38632 ) * remove ipex_optimize_model usage Signed-off-by: YAO Matrix <matrix.yao@intel.com> * update Dockerfile Signed-off-by: root <root@a4bf01945cfe.jf.intel.com> --------- Signed-off-by: YAO Matrix <matrix.yao@intel.com> Signed-off-by: root <root@a4bf01945cfe.jf.intel.com> Co-authored-by: root <root@a4bf01945cfe.jf.intel.com>	2025-06-06 20:04:44 +02:00
Yih-Dar	5009252a05	Better CI (#38552 ) better CI Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-06 17:59:14 +02:00
jiqing-feng	2e889c18e1	fix torch_dtype on awq (#38463 ) Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-06-06 17:14:00 +02:00
inkcherry	871901cb3d	fix total batch size calculation in trainer (#38286 ) * fix total batch size calculation * update Signed-off-by: inkcherry <mingzhi.liu@intel.com> * Update src/transformers/trainer.py --------- Signed-off-by: inkcherry <mingzhi.liu@intel.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-06-06 14:54:00 +00:00
Yih-Dar	02f946a038	Don't run `AriaForConditionalGenerationModelTest` on CircleCI (#38615 ) git rid of this model Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-06 11:30:31 +02:00
Mehant Kammakomati	3d15606e64	fix: support grad clipping for TP through replicating non-sharded modules (#36132 ) * feat: fix tp grad norm: Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com> * feat: use implicit replication Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com> --------- Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-06-06 11:07:22 +02:00
Yih-Dar	fca6748246	Improve `test_initialization` for `SwiftFormer` (#38636 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-06 10:47:10 +02:00
Yih-Dar	92a87134ea	update `ColQwen2ModelIntegrationTest` (#38583 ) * update * update * update * update * 4 bit * 8 bit * final --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-06 10:41:17 +02:00
Raushan Turganbay	dbfc79c17c	[generation] bring back tests on vision models (#38603 ) * bring back geenration tests on VLMs * remove head mask tests overwritten	2025-06-06 08:23:15 +00:00
Yih-Dar	90c4b90a10	Use torch 2.7.1 on CircleCI jobs (#37856 ) 2.7.1 Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-06 10:16:57 +02:00
Yih-Dar	3e35ea1782	Improve `test_initialization` (#38607 ) * fix flaky init tests * fix flaky init tests --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-06 10:08:05 +02:00
Yao Matrix	89542fb81c	enable more test cases on xpu (#38572 ) * enable glm4 integration cases on XPU, set xpu expectation for blip2 Signed-off-by: Matrix YAO <matrix.yao@intel.com> * more Signed-off-by: YAO Matrix <matrix.yao@intel.com> * fix style Signed-off-by: YAO Matrix <matrix.yao@intel.com> * refine wording Signed-off-by: YAO Matrix <matrix.yao@intel.com> * refine test case names Signed-off-by: YAO Matrix <matrix.yao@intel.com> * run Signed-off-by: YAO Matrix <matrix.yao@intel.com> * add gemma2 and chameleon Signed-off-by: YAO Matrix <matrix.yao@intel.com> * fix review comments Signed-off-by: YAO Matrix <matrix.yao@intel.com> --------- Signed-off-by: Matrix YAO <matrix.yao@intel.com> Signed-off-by: YAO Matrix <matrix.yao@intel.com>	2025-06-06 09:29:51 +02:00
Armaghan Shakir	31023b6909	Fix `MiniMax` (docs and integration tests checkpoint) (#38575 ) * update checkpoints for integration tests * minor fixes in docs	2025-06-06 08:43:11 +02:00
Vanshu	593e29c5e2	Updated Aria model card (#38472 ) * Update aria.md * Update aria.md * Suggested Updates - aria.md	2025-06-05 14:36:54 -07:00
Parag Ekbote	77cf4936fe	[Nit] Add Note on SigOpt being in Public Archive Mode (#38610 ) * add note on sigopt * update * Update docs/source/en/hpo_train.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-06-05 14:07:23 -07:00
Monish Singhal	c75bf2c36e	Fix typo in LLaVa documentation (#38618 ) * Fix typo in LLaVa documentation In exactly one section, LlavaImageProcessor was spelt wrongly as LLavaImageProcessor, which throws off copy-pasting the section. * Fix LlavaImageProcessor url to make it valid (and copypaste-able) Earlier, the URL contained the entire HF prefix. This commit removes that to ensure that the code block can be copied and run as is.	2025-06-05 13:25:07 -07:00
johncaged	5399c1d670	docs: fix dark mode logo display. (#38586 )	2025-06-05 13:06:59 -07:00
Yih-Dar	481b953170	Fix `return_dict=False` giving errors in a few VLM models (#38519 ) update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-05 21:19:07 +02:00
Sai-Suraj-27	88912b8e95	Remove `isort` from dependencies (#38616 ) Removed isort as a dependency	2025-06-05 16:42:49 +00:00
David Klank	fa921ad854	fix spelling errors (#38608 ) * fix errors test_modeling_mllama.py * fix error test_modeling_video_llava.py * fix errors test_processing_common.py	2025-06-05 13:57:23 +01:00
Isotr0py	0f833528c9	Avoid overwrite existing local implementation when loading remote custom model (#38474 ) * avoid overwrite existing local implementation when loading custom remote model Signed-off-by: Isotr0py <2037008807@qq.com> * update comments Signed-off-by: Isotr0py <2037008807@qq.com> --------- Signed-off-by: Isotr0py <2037008807@qq.com>	2025-06-05 13:54:40 +01:00
KameniAlexNea	8f630651b0	Allow `mlm_probability` to be set to `None` when `mlm=False` in DataCollatorForLanguageModeling (#38522 ) (#38537 ) * mlm_probability in DataCollatorForLanguageModeling should be validated only when mlm is True (#38522) * Change mlm_probability to Optional in DataCollatorForLanguageModeling (#38537) --------- Co-authored-by: eak <eak@ivalua.com>	2025-06-05 13:54:12 +01:00
dependabot[bot]	65f5fa71cd	Bump torch from 2.6.0 to 2.7.1 in /examples/flax/vision (#38606 ) Bumps [torch](https://github.com/pytorch/pytorch) from 2.6.0 to 2.7.1. - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md) - [Commits](https://github.com/pytorch/pytorch/compare/v2.6.0...v2.7.1) --- updated-dependencies: - dependency-name: torch dependency-version: 2.7.1 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-06-05 13:38:02 +01:00
Yih-Dar	8c59cdb3f8	pin pandas (#38605 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-05 11:33:06 +02:00
Yih-Dar	8cfcfe58c0	Remove custom pytest and pluggy (#38589 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-05 10:23:40 +02:00
Raushan Turganbay	0d69fa6dcd	[qwen-omni] fix sliding window (#38525 ) fix	2025-06-05 10:11:58 +02:00
Henrik Matthiesen	1fed6166c0	added fast image processor for ZoeDepth and expanded tests accordingly (#38515 ) * added fast image processor for ZoeDepth and expanded tests accordingly * added fast image processor for ZoeDepth and expanded tests accordingly, hopefully fixed repo consistency issue too now * final edits for zoedept fast image processor * final minor edit for zoedepth fast imate procesor	2025-06-04 22:59:17 +00:00
Sai-Suraj-27	a510be20f3	Updated deprecated typing imports with equivalents for Python 3.9+ (#38546 ) * Replace deprecated typing imports with collections.abc equivalents for Python 3.9+ * Fixed code quality --------- Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-06-04 16:57:23 +00:00
RogerSinghChugh	8e1266de2b	New gpt neo model card (#38505 ) * Updated BERTweet model card. * Update docs/source/en/model_doc/bertweet.md * updated toctree (EN). * Commit for new_gpt_model_card. * Update docs/source/en/model_doc/gpt_neo.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-06-04 09:56:47 -07:00
Dmitry Rogozhkin	8046aff520	tests/roformer: fix couple roformer tests on gpus (#38570 ) Fix "RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu" error running the following roformer tests on GPUs (CUDA or XPU): ``` tests/models/roformer/test_modeling_roformer.py::RoFormerSinusoidalPositionalEmbeddingTest::test_basic tests/models/roformer/test_modeling_roformer.py::RoFormerSelfAttentionRotaryPositionEmbeddingTest::test_apply_rotary_position_embeddings ``` Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>	2025-06-04 18:45:56 +02:00
Aryan Chauhan	b9c17c5dc0	[Dinov2] Enable device_map="auto" support (#38487 ) * Fix: resolve import order and duplicate import (ruff I001, F811) * Format: clean up Dinov2 test file with ruff formatter * Add _no_split_modules = ['Dinov2Layer'] to enable device_map='auto' * Revert dinov2_with_registers _no_split_modules to original state * Remove redundant device_map test as suggested * Remove unused import after deleting test * removed import torch and the redundant test function * Update tests/models/dinov2/test_modeling_dinov2.py --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-06-04 15:42:40 +00:00
Luc Georges	ae3733f06e	feat: add `repository` field to benchmarks table (#38582 ) * feat: add `repository` field to benchmarks table * fix: remove unwanted `,`	2025-06-04 15:40:52 +02:00
Manal ML	1285aec4cc	Docs: fix code formatting in torchao docs (#38504 )	2025-06-04 12:35:21 +00:00
Minho Ryu	6c5d4b1dd2	allow custom head_dim for qwen2_moe (#37188 ) allow custom head_dim Co-authored-by: ryan.agile <ryan.agile@kakaobrain.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-06-04 12:27:30 +00:00

19237 Commits