HuggingFace_transformer

Author	SHA1	Message	Date
Raushan Turganbay	e42681b48b	[gemma3] support sequence classification task (#39465 ) * add seq clf class * fix docs and add in auto-map * skip tests * optional pixels	2025-07-21 11:03:20 +02:00
Yoni Gozlan	541bed22d6	Improve @auto_docstring doc and rename `args_doc.py` to `auto_docstring.py` (#39439 ) * rename `args_doc.py` to `auto_docstring.py` and improve doc * modifs after review	2025-07-18 18:00:34 +00:00
Yoni Gozlan	de0dd3139d	Add fast image processor SAM (#39385 ) * add fast image processor sam * nits	2025-07-18 17:27:16 +00:00
Cyril Vallez	4ded9a4113	🚨🚨 Fix and simplify attention implementation dispatch and subconfigs handling (#39423 ) * first try * Update modeling_utils.py * Update modeling_utils.py * big refactor * Update modeling_utils.py * style * docstrings and simplify inner workings of configs * remove all trace of _internal * Update modeling_utils.py * fix logic error * Update modeling_utils.py * recursive on config * Update configuration_utils.py * fix * Update configuration_dpt.py * Update configuration_utils.py * Update configuration_utils.py * Update modeling_idefics.py * Update modeling_utils.py * fix for old models * more old models fixup * Update modeling_utils.py * Update configuration_utils.py * Remove outdated test * remove the deepcopy!! 🥵🥵 * Update test_modeling_gpt_bigcode.py * fix qwen dispatch * restrict to only models supporting it * style * switch name * Update modeling_utils.py * Update modeling_utils.py * add tests! * fix * rypo * remove bad copies * fix * Update modeling_utils.py * additional check * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * fix * skip	2025-07-18 13:41:54 +02:00
eustlb	967045082f	Add voxtral (#39429 ) * draft * draft update (conversion working) * mend * draft update * draft update: working generate * refactor * VoxtralProcessor draft * processor update * update convert_tekken_tokenizer * refactor processor * update convert * make style * better handle prefil * make style * add tests * add mistral_common audio loading * processor update * revert changes * audio utils update * add audio to apply chat template mistral update * voxtral processor update * fix * udpate converstion script * make mistral tokenier from pretrain work from local dir * fix udpates * add integration tests * add batched version * processor docstring * make style * revert convert_tekken_tokenizer changes * revert processing_qwen2.5 changes * add multi-turn test * processor improvements * address review changes * Update src/transformers/tokenization_mistral_common.py Co-authored-by: Julien Denize <40604584+juliendenize@users.noreply.github.com> * update audio utils * nits * integration test update * correct _support * update tests * test update * update integration tests * fix * fix * fix * add test_apply_chat_template_with_audio * add model doc * model doc * nit * doc uptade * nit * processor improvement * ensure default is 3B * nits * make * make * convert modular * update checkpoint * fix test * make * make * autos * make * make * nit * nit * nit --------- Co-authored-by: Julien Denize <40604584+juliendenize@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-07-18 00:02:04 +00:00
Joao Gante	bf6c997685	[serve] Add speech to text (`/v1/audio/transcriptions`) (#39434 ) * Scaffolding * Explicit content * Naïve Responses API streaming implementation * Cleanup * Scaffolding * Explicit content * Naïve Responses API streaming implementation * Cleanup * use openai * validate request, including detecting unused fields * dict indexing * dict var access * tmp commit (tests failing) * add slow * use oai output type in completions * (little rebase errors) * working spec? * guard type hint * type hints. fix state (CB can now load different models) * type hints; fn names; error type * add docstrings * responses + kv cache * metadata support; fix kv cache; error event * add output_index and content_index * docstrings * add test_build_response_event * docs/comments * gate test requirements; terminate cb manager on model switch * nasty type hints * more type hints * disable validation by default; enable force models * todo * experiment: base model from typed dict * audio working * fix bad rebase * load audio with librosa * implement timed models * almost working * make fixup * fix tests * transcription request type * tokenizer -> processor * add example in docs --------- Co-authored-by: Lysandre <hi@lysand.re>	2025-07-17 14:29:57 +00:00
Dhruv Malik	787a0128a9	create ijepa modelcard (ref : PR #36979 ). (#39354 ) * wip: adding first version of the IJEPA model card. * refactor based on the @stevhliu feedbacks * refactor: - revert the accidental removal of the autodoc api description and the image reerece architecture - general context updation. * - changes of model for example quantization. - merging the quantization content.	2025-07-16 12:40:22 -07:00
ridima11	48f2233cdf	Improve grammar and clarity in perf_hardware.md (#39428 )	2025-07-16 12:15:15 -07:00
Lysandre Debut	de5ca373ac	Responses API in `transformers serve` (#39155 ) * Scaffolding * Explicit content * Naïve Responses API streaming implementation * Cleanup * Responses API (to be merged into #39155) (#39338) * Scaffolding * Explicit content * Naïve Responses API streaming implementation * Cleanup * use openai * validate request, including detecting unused fields * dict indexing * dict var access * tmp commit (tests failing) * add slow * use oai output type in completions * (little rebase errors) * working spec? * guard type hint * type hints. fix state (CB can now load different models) * type hints; fn names; error type * add docstrings * responses + kv cache * metadata support; fix kv cache; error event * add output_index and content_index * docstrings * add test_build_response_event * docs/comments * gate test requirements; terminate cb manager on model switch * nasty type hints * more type hints * disable validation by default; enable force models * todo --------- Co-authored-by: Lysandre <hi@lysand.re> * Slight bugfixes * PR comments from #39338 * make fixup --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Joao Gante <joao@huggingface.co>	2025-07-16 14:16:16 +02:00
Raushan Turganbay	c8524aeb07	[cache] make all classes cache compatible finally (#38635 ) * dump * push other models * fix simple greedy generation * xmod * add fmst and clean up some mentions of old cache format * gpt-bigcode now follows standards * delete tuple cache reference in generation * fix some models * fix some models * fix mambas and support cache in tapas * fix some more tests * fix copies * delete `_reorder_cache` * another fix copies * fix typos and delete unnecessary test * fix rag generate, needs special cache reordering * fix tapas and superglue * reformer create special cache * recurrent gemma `reorder_cache` was a no-op, delete * fix-copies * fix blio and musicgen pipeline tests * fix reformer * fix reformer, again... * delete `_supports_cache_class` * delete `supports_quantized_cache` * fix failing tests * fix copies * some minor clean up * style * style * fix copies * fix tests * fix copies * create causal mask now needs positions? * fixc copies * style * Update tests/test_modeling_common.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * clean-up of non-generative model after merging main * check `is_decoder` for cache * delete transpose for scores * remove tuple cache from docs everywhere * fix tests * fix copies * fix copies once more * properly deprecate `encoder_attention_mask` in Bert-like models * import `deprecate_kwarg` where needed * fix copies again * fix copies * delete `nex_decoder_cache` * fix copies asks to update for PLM * fix copies * rebasing had a few new models, fix them and merge asap! * fix copies once more * fix slow tests * fix tests and updare PLM checkpoint * add read token and revert accidentally removed line * oh com -on, style * just skip it, read token has no access to PLM yet --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-07-16 14:00:17 +02:00
Ilias Aarab	6cb43defd0	docs: add missing numpy import to minimal example (#39444 ) docs: add numpy import to minimal example	2025-07-16 11:57:13 +00:00
Marc Sun	bfc9ddf5c6	Add StableAdamW Optimizer (#39446 ) * Added StableAdamW as an optimizer option for Trainer. Also wrote tests to verify its behaviour. * Fixed issue with * Added docs for StableAdamW. Also fixed a typo in schedule free optimizers --------- Co-authored-by: Gautham Krithiwas <gauthamkrithiwas2003@gmail.com>	2025-07-16 13:35:53 +02:00
Pablo Montalvo	b9ee528246	add test scanner (#39419 ) * add test scanner * add doc + license * refactor for only 1 tree traversal * add back test of only one method * document single method scan * format * fixup generate tests * minor fix * fixup * fixup doc	2025-07-16 12:45:46 +02:00
StevenBucaille	1bc9ac5107	docs: update LightGlue docs (#39407 ) * docs: update LightGlue docs * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-07-15 12:40:50 -07:00
StevenBucaille	d9574f2fe3	docs: update SuperGlue docs (#39406 ) * docs: update SuperGlue docs * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-07-15 12:40:26 -07:00
Orion Weller	0e4b7938d0	Add ModernBERT Decoder Models - ModernBERT, but trained with CLM! (#38967 ) * working locally; need to style and test * added docs and initial tests; need to debug and flesh out * fixed tests * working long context; batches * working fa2 and eager * update tests * add missing confnigs * remove default autoset * fix spacing * fix most tests * fixed tests * fix to init * refactor to match new transformers updates * remove static cache option * fa2 fix * fix docs * in progress * working on tests * fixed issue with attn outputs * remove debug * fix local config attr * update doc string * fix docstring * add docs to toc * correct typo in toc * add new updates from main w.r.t. ModernBERT RoPE * fix local param --------- Co-authored-by: oweller2 <oweller2@dsailogin.mgmt.ai.cluster> Co-authored-by: oweller2 <oweller2@l07.mgmt.ai.cluster> Co-authored-by: oweller2 <oweller2@n02.mgmt.ai.cluster> Co-authored-by: oweller2 <oweller2@l08.mgmt.ai.cluster> Co-authored-by: oweller2 <oweller2@l01.mgmt.ai.cluster> Co-authored-by: oweller2 <oweller2@l02.mgmt.ai.cluster>	2025-07-15 10:40:41 +02:00
Tanuj Rai	8d40ca5749	Update phi4_multimodal.md (#38830 ) * Update phi4_multimodal.md * Update docs/source/en/model_doc/phi4_multimodal.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/phi4_multimodal.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/phi4_multimodal.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/phi4_multimodal.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/phi4_multimodal.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update phi4_multimodal.md * Update phi4_multimodal.md * Update phi4_multimodal.md * Update phi4_multimodal.md * Update phi4_multimodal.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-07-14 10:35:17 -07:00
MilkClouds	3635415af2	[Docs] Fix typo in CustomTrainer compute_loss method and adjust loss reduction logic (#39391 ) Fix typo in CustomTrainer compute_loss method and adjust loss reduction logic	2025-07-14 09:25:06 -07:00
Parag Ekbote	5c30f7e390	Update Model Card for Encoder Decoder Model (#39272 ) * update model card. * add back the model contributors for mamba and mamba2. * update the model card. * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * update batches with correct alignment. * update examples and remove quantization example. * update the examples. * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * update example. * correct the example. --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-07-11 11:23:08 -07:00
Xiang Chendong	0d7efe3e4b	fix gpt2 usage doc (#39351 ) fix typo of gpt2 doc usage	2025-07-11 10:59:41 -07:00
Muhammad Shaheer Malik	a646fd55fd	Updated CamemBERT model card to new standardized format (#39227 ) * Updated CamemBERT model card to new standardized format * Applied review suggestions for CamemBERT: restored API refs, added examples, badges, and attribution * Updated CamemBERT usage examples, quantization, badges, and format * Updated CamemBERT badges * Fixed CLI Section	2025-07-11 10:59:09 -07:00
Julien Denize	70e57e4710	Add mistral common support (#38906 ) * wip: correct docstrings * Add mistral-common support. * quality * wip: add requested methods * wip: fix tests * wip: add internally some methods not being supported in mistral-common * wip * wip: add opencv dependency and update test list * wip: add mistral-common to testing dependencies * wip: revert some test changes * wip: ci * wip: ci * clean * check * check * check * wip: add hf image format to apply_chat_template and return pixel_values * wip: make mistral-common non-installed safe * wip: clean zip * fix: from_pretrained * fix: path and base64 * fix: path and import root * wip: add docs * clean * clean * revert --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-07-11 16:26:58 +00:00
Shuming Hu	bf607f6d3b	PerceptionLM (#37878 ) * plm template * A working plm with fixed image features * hacked processor * First version that reproduced PLM output using PE from timm. * Simplify and fix tie_word_embeddings * Use PIL resize. Simplify converstion. * First version that works with video input. * simplifed image preprocessing (not batched) * Minor fixes after rebasing on main. * Video processor based on new API. * Revert to use _preprocess for image processor. * refactor with modular * fix tie_word_embedding * Testing with timm PE * check in missed converstion from modular to model.py * First working version of PLM with Eva PE. PLM-1B and 3B outputs are exactly the same as before. PLM-8B output has some differences. * address review comments * Fixed batching if video and image examples mixed. * Simplify PE configuration. * Enable AutoModel for PerceptionEncoder. * Update PE config style. * update all headers * Minor fixes. * Move lm_head to PerceptionLMForConditionalGeneration. Fix vit_G model specification. * Fix for testing_modeling_perception_lm.py * Image processing refactoring to use more common parts. * Fix processor test. * update tests to use model from hub * More test fixes. * integration test GT update after rebasing; probably due to video preprocessing * update test media path to hub * Stop tracking local scripts * address some review comments * refactor image processing. * small fixes * update documentation and minor fixes * remove scripts * Minor fix for CI * Fix image processing * CI and doc fix * CI formatting fix * ruff fix * ruff formatting * ran utils/sort_auto_mappings.py * update docstring * more docstring udpates * add vision_input_type default fallback for image processing * more verbose variable naming * test update * Remove PE and PEConfig use AutoModel(TimmWrapper) instead * Minor cleanup. * Minor Fix: remove any ref to PE. Ruff format and check. * fix docstring * Fix modular/model consistency.Improvex docstringfor . * Fix PerceptionLMForConditionalGenerationModelTest * ruff fix * fix for check_repo * minor formatting * dummy size arg to fix for processor test. * Update docstring for PerceptionLMConfig * Minor fixes from review feedback. * Revert some minor changes per reviewer feedback. * update base_model_prefix * address reviewer feedback * fix comment in modeling file * address reviewer feedback * ruff format * Pre-merge test update. * reapply modular and fix checkpoint name * processor test path * use modular a bit more * remove dead code * add token decorator --------- Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co> Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>	2025-07-11 11:07:32 +02:00
Giuseppe Coccia	4b47b2b8ea	Updated Switch Transformers model card with standardized format (Issue #36979 ) (#39305 ) * Updated Switch Transformers model card with standardized format (Issue #36979) * Apply reviewer suggestions to the new standardised Switch Transformer's model card * Update switch_transformers.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-07-10 15:34:10 -07:00
Paul Pak	9682d07f92	LFM2 (#39340 ) * [modeling][lfm2] LFM2 model on 4.53.0 interface * [configuration] hook in LFM2 keys * [modeling][lfm2] update modeling interface for 4.53.1 * [modeling][lfm2] apply mask to hidden conv states * [misc] ruff format/lint * [modeling][lfm2] minor: NotImplemented legacy cache conversion * Create lfm2.md * create nice modular * style * Update modeling_auto.py * clean and start adding tests * style * Update test_modeling_lfm2.py * Update __init__.py * small test model size * config * small fix * fix * remove useless config attrs -> block_dim and conv_dim are hiden_size * fix prepare inputs * fix config * test * typo * skip tests accordingly * config docstrings * add doc to .md * skip config docstring check --------- Co-authored-by: Maxime Labonne <81252890+mlabonne@users.noreply.github.com> Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>	2025-07-10 16:07:33 +02:00
Raushan Turganbay	bc161d5d06	Delete deprecated stuff (#38838 ) * delete deprecated stuff * fix copies * remove unused tests * fix modernbert and fuyu * Update src/transformers/cache_utils.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * bye bye `seen_tokens` * address comments * update typings * ecnoder decoder models follow same pattern as whisper * fix copies * why is it set to False? * fix switch transformers * fix encoder decoder models shared weight * fix copies and RAG * remove `next_cache` * fix gptj/git * fix copies * fix copies * style... * another forgotten docsrting --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-07-10 05:18:44 +00:00
Tom Aarsen	5111c8ea2f	Fix typo: langauge -> language (#39317 )	2025-07-09 12:06:46 -07:00
Priya aka Priyamvadha Balakrishnan	2781ad092d	docs: update LLaVA-NeXT model card (#38894 ) * docs: update LLaVA-NeXT model card * Update docs/source/en/model_doc/llava_next.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/llava_next.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/llava_next.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/llava_next.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/llava_next.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/llava_next.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/llava_next.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/llava_next.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * [docs] Updated llava_next model card * Update docs/source/en/model_doc/llava_next.md remove image sources Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * [fix] Change Flash Attention to SDPA badge * [doc] fixed quantization example * docs: updated contribution details and badges * Update llava_next.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-07-09 11:32:40 -07:00
Eman Risha	d61c0d087c	Updated the Model docs - for the MARIAN model (#39138 ) * Update marian.md This update improves the Marian model card to follow the Hugging Face standardized model card format. The changes include: - Added a clear description of MarianMT, its architecture, and how it differs from other models. - Provided usage examples for Pipeline and AutoModel. - Added a quantization example for optimizing model inference. - Included instructions and examples for multilingual translation with language codes. - Added an Attention Mask Visualizer example. - Added a Resources section with relevant links to papers, the Marian framework, language codes, tokenizer guides, and quantization documentation. - Fixed formatting issues in the code blocks for correct rendering. This update improves the readability, usability, and consistency of the Marian model documentation for users. * Update docs/source/en/model_doc/marian.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/marian.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/marian.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/marian.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/marian.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/marian.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/marian.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/marian.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/marian.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/marian.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/marian.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/marian.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/marian.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/marian.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/marian.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/marian.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/marian.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update marian.md * Update marian.md * Update marian.md * Update marian.md * Update docs/source/en/model_doc/marian.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update marian.md * Update marian.md * Update marian.md * Update marian.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-07-09 10:23:03 -07:00
MaCAT	4652677c89	🌐 [i18n-KO] Translated quark.md to Korean (#39268 ) * initial translation * removed english parts * maintain consistency * Update docs/source/ko/quantization/quark.md Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com> * Update docs/source/ko/quantization/quark.md Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com> * Update docs/source/ko/quantization/quark.md Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com> * Update docs/source/ko/quantization/quark.md Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com> * add toctree * fixed indentation --------- Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>	2025-07-09 09:29:51 -07:00
Vladislav Bronzov	c980904204	Add DeepSeek V2 Model into Transformers (#36400 ) * add initial structure * doc fixes, add model base logic * update init files * some fixes to config and modular * some improvements for attention * format * remove unused attn * some fixes for moe layer and for decoder * adapt _compute_yarn_parameters for deepseek * format * small fix * fix for decoder forward * add tests, small refactoring * fix dummies * fix init * fix doc * fix config docs * add sequce doc, fix init for gate * fix issues in tests * fix config doc * remove unused args * some fixes and refactoring after review * fix doc for config * small fixes for config args * revert config refactoring * small refactoring * minor fixes after rebase * small fix after merge * fix modular * remove rotaryembd from public init * small test fix * some rotary pos calculation improvement * fix format * some improvements and fixes * fix config * some refactoring * adjust some unit tests * skip test * small fixes and tests adjustment * reapply modular * fix all tests except Integration * fix integration testzs * cleanup BC stuff * rope * fix integrations tests based on a10 * style --------- Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co> Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>	2025-07-09 17:04:28 +02:00
Biao Zhang	7ef592c96c	Update T5gemma (#39210 ) * bug fix: add vocab_size to t5gemmaconfig for pipeline. * Update checkpoint placeholder * minor change * minor change * minor change: update example. * fix: add vocab_size as an explict arg. * buf fix: remove vocab_size verification; instead, re-set encoder/decoder vocab size. Note, in t5gemma, vocab size of encoder/decoder shoud be always the same. * add `add_generation_prompt` for message preprocessing.	2025-07-08 19:08:48 +02:00
Quentin Lhoest	1ecd52e50a	Add torchcodec in docstrings/tests for `datasets` 4.0 (#39156 ) * fix dataset run_object_detection * bump version * keep same dataset actually * torchcodec in docstrings and testing utils * torchcodec in dockerfiles and requirements * remove duplicate * add torchocodec to all the remaining docker files * fix tests * support torchcodec in audio classification and ASR * [commit to revert] build ci-dev images * [commit to revert] trigger circleci * [commit to revert] build ci-dev images * fix * fix modeling_hubert * backward compatible run_object_detection * revert ci trigger commits * fix mono conversion and support torch tensor as input * revert map_to_array docs + fix it * revert mono * nit in docstring * style * fix modular --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-07-08 17:06:12 +02:00
Joao Gante	6f1a43896c	[CI] fix docs (#39273 ) * fix docs * add ko gloassary file to toctree	2025-07-08 11:31:03 +01:00
Yaswanth Gali	fbdaa7b099	Add Aimv2 model (#36625 ) * Model skelton * changes * temp push * changes * Added support for aimv2-native * More changes * More changes * Stupid mistake correction * Added config and refactor * Added vison model * update * Refactor for lit variant * Added Text Model * Minor fixes * nits * update * Preliminary tests * More fixes * Updated tests 🤗 * Refactor * Updated testcase * Updated config * make fixup * more fixes * Bug fix and updates * deadcode * Fixes * nit * up * Happy CI ✅ * Reduce LOC * nit * nit * make style * return_dict refactor * bug fix * fix * doc update * nit * make fixup * Minor update * _init_weigths modifcation * update tests * Minor fixes post review * Update w.r.t GradientCheckpointingLayer * docs update * update * nit * Use more Modular 😉 * Change name from AIMv2 to Aimv2 * Nit * make style * Add model doc pointer * make style * Update model doc section * updates * Modify attn mask and interface * update test * Final change * Utilize flash and flex attn * keep attn mask * camelcase model name in test file * Fix docstring * Fix config warning finally and create_causal_mask * disable torchscript * remove unused arg * remove from tests * balance model size for tests * fix device * tests * tests * flaky test * fix import --------- Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co> Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>	2025-07-08 11:53:21 +02:00
Jingze Shi	d8590b4b0c	Add Doge model (#35891 ) * Add Doge Model * Fix code quality * Rollback an error commit * Fix config for open-source weights * Revert "Fix config for open-source weights" This reverts commit 229cdcac10a6a4274d1dd13b729bc14c98eb0c76. * Add modular_doge * Update Doge inherits from Llama * Fix import bug * [docs] Add usage of doge model * Fix Doge import pretrainedconfig from modeling_utils to configuration_utils * [docs] remove trust remote code from doge * Fix dynamo bug in doge model * Update docstrings * Import apply_rotary_pos_emb and repeat_kv from Llama * Fix all nits * Fix code quality * Fix some bugs * Fix code quality * Remove inherited `_update_causal_mask` from Llama This leads to incorrect weight initialization. * Fix the wrong tensor orderings in DogeCDMoE * Fix attention mask bug We have to provide attention_mask for dynamic mask computation * Modify most implementations to inherit from Llama But there are two problems: 1. `flex_attention_forward` is not updated properly 2. `Example` error in the forward method of DogeForCausalLM * Modify CDMoE for batch efficient implementation * Uniform MoE configuration names, just like QwenMoE * Fix code quality * Fix code quality * Fix code quality * Add tp plan of CDMoE Module * Hybird DMA with sliding window * Update valid tokens greater than window size * Fix code quality * Add `convert_doge_weights_to_hf` * Fix STATE_DICT_MAPPING in convert_doge_weights_to_hf.py * Fix nits in modular_doge * Fix code quality * Fix all nits * Fix all nits * Make sure the attention function is updated inside the class * Fix code quality issues in the Doge model and add a test for it * Fix `test_generate` * Fix code quality * Fix nits fllowing suggestions * Fix code quality * Fix code quality issues * Fix nits * Fix code quality nits * Fix the missing parameters in the configuration. * Fix the missing parameters in the configuration. * Fix nits * Add initialization of attention * Fix last nits * Simplify dynamic mask generation logic * Rename router_logits to gate_logits for matching latest changes of MixtralModel * Rename typings for matching latest changes of MixtralModel * Fixes typo in comment * Update src/transformers/models/doge/modular_doge.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Fix code quality issues to match other modular * Fix code quality issues to match other modular * Fix the static compilation errors * Update model weights link * Fix code quality issues to match other modular * reapply modular and support for new outputs * style * simplify a lot * fix import location * reapply modular * fix * fix integration test --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co> Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>	2025-07-08 11:44:29 +02:00
gudwls215	ea3c2c0277	Fix license text, duplicate assignment, and typo in constant names (#39250 ) - Complete Apache License text in Italian documentation - Remove duplicate variable assignment in Perceiver converter - Fix typo in MODEL_FOR_VISION_2_SEQ_MAPPING_NAMES constant	2025-07-08 10:20:52 +02:00
Yuxuan Zhang	17b3c96c00	Glm 4 doc (#39247 ) * update the glm4 model readme * update test * update GLM-4.1V model * update as format * update * fix some tests * fix the rest * fix on a10, not t4 * nit: dummy import --------- Co-authored-by: raushan <raushan@huggingface.co>	2025-07-08 08:22:04 +02:00
Drew Ross	bbca9782ca	Update LED model card (#39233 ) * Update LED model card * Remove extra arguments * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-07-07 15:56:57 -07:00
Mikhail Moskovchenko	3993ee1e98	Add `segmentation_maps` support to MobileNetV2ImageProcessor (#37312 ) * Add `segmentation_maps` support to mobilenet_v2 image processor and `reduce_labels` to mobilevit * Changed mobilenetv2 tests to support fastimageprocessor * added `segmentation_maps` support to fast image processor * reverted to upstream/main * Add optional * Use autodocstring * Changed docs * Docs fix * Changed fp to match beit fp * Change typing imports * Fixed repo inconsistency * Added fast-slow equivalence tests * Removed unnecessary call * Add `reduce_labels` to Mobilevit fast processor --------- Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>	2025-07-07 13:34:59 -04:00
Joosun Hwang	9698052560	Add Korean translation for glossary.md (#38804 ) * Add Korean translation for glossary.md * Update docs/source/ko/glossary.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ko/glossary.md Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com> * Update docs/source/ko/glossary.md Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com> * Update docs/source/ko/glossary.md Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com> * Update docs/source/ko/glossary.md Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com> * Update docs/source/ko/glossary.md Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com> * Update docs/source/ko/glossary.md Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com> * Update docs/source/ko/glossary.md Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com> * Update docs/source/ko/glossary.md Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com> * Update docs/source/ko/glossary.md Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com> * Update docs/source/ko/glossary.md Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com> * Update docs/source/ko/glossary.md Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com> * Update docs/source/ko/glossary.md Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com> --------- Co-authored-by: Joosun40 <77312900+Joosun40@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>	2025-07-07 09:12:55 -07:00
Lucain	bf203aa9da	Update tiny-agents example (#39245 )	2025-07-07 15:58:36 +02:00
jiqing-feng	14cba7ad33	enable xpu on kv-cache and hqq doc (#39246 ) Signed-off-by: jiqing-feng <jiqing.feng@intel.com>	2025-07-07 13:12:02 +00:00
Daniel van Strien	b8f397e456	fix typo in Gemma3n notes (#39196 )	2025-07-07 14:41:33 +02:00
Joao Gante	85d93cc6e3	[serve] Cursor support, move docs into separate page, add more examples (#39133 ) * jan docs * rm * [cursor] tmp commit * Cursor working :D * Update docs/source/en/serving.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/serving.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/serving.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/serving.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/serving.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/serving.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/serving.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update docs/source/en/serving.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update src/transformers/commands/serving.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * cursor docs * try to fix agents/tools docs? * try to fix agents/tools docs? * Update docs/source/en/serving.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * add transformers chat example with transformers serve --------- Co-authored-by: Pedro Cuenca <pedro@huggingface.co>	2025-07-03 17:04:16 +01:00
Anton Vlasjuk	b31e9d19a6	[`Dia`] Change ckpt path in docs (#39181 ) fix ckpt path	2025-07-03 10:02:58 +00:00
Steven Liu	df12d87d18	[docs] ViTPose (#38630 ) * vitpose * fix? * fix? * feedback * fix * feedback * feedback * update sample image	2025-07-02 07:56:29 -07:00
Yaswanth Gali	b61023a1b7	🚨🚨🚨 [eomt] make EoMT compatible with pipeline (#39122 ) * Make EoMT compatible with pipeline * Implicit patch offsets * remove patch offsets from arg * Modify tests * Update example * fix proc testcase * Add few more args * add pipeline test suite * fix * docstring fixes * add pipeline test * changes w.r.t review * 🙈 MB * should fix device mismatch * debug * Fixes device mismatch * use decorator * we can split mlp * expected values update --------- Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2025-07-02 12:25:26 +01:00
Chong You	e8e0c76162	Add activation sparsity reference in gemma3n doc (#39160 ) Add activation sparsity reference in the description of gemma3n	2025-07-02 04:11:03 +02:00
Drew Ross	fe35eca7bd	Update BigBirdPegasus model card (#39104 ) * Update igbird_pegasus.md * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-06-30 10:42:56 -07:00

3421 Commits