HuggingFace_transformer

Author	SHA1	Message	Date
JoestarGagan	ec8a09a5fe	Feature/standardize opt model card (#39568 ) * docs: Standardize OPT model card with enhanced details * Remove incorrect link from OPT model card * Address review feedback on OPT model card * Update opt.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-07-23 10:57:48 -07:00
Eric Bezzam	c5a80dd6c4	🔴 Fix EnCodec internals and integration tests (#39431 ) * EnCodec fixes and update integration tests. * Apply padding mask when normalize is False. * Update comment of copied function. * Fix padding mask within modeling. * Revert padding function. * Simplify handling of padding_mask. * Address variable codebook size. * Add output for padding for consistency with original model, fix docstrings. * last_frame_pad_length as int * Update example code. * Improve docstring/comments. * Shorten expected output. * Consistent docstring. * Parameterize tests. * Properties for derived variables. * Update expected outputs from GitHub runner. * Consistent outputs with runner GPUs.	2025-07-23 19:39:27 +02:00
Eric Bezzam	7a4e2e7868	Fix DAC integration tests and checkpoint conversion. (#39313 ) * Fix DAC (slow) integration tests. * Fix DAC conversion. * Address comments * Sync with main, uncomment nn.utils.parametrizations.weight_norm. * Update DAC integration tests with expected outputs. * Added info about encoder/decoder error and longer decoder outputs. * Parameterize tests. * Set expected values to GitHub runners.	2025-07-23 19:21:26 +02:00
Eric Bezzam	596a75f6e9	Move openai import (#39613 )	2025-07-23 19:05:39 +02:00
Lysandre Debut	a0e5a7d34b	Transformers serve VLM (#39454 ) * Add support for VLMs in Transformers Serve * Raushan comments * Update src/transformers/commands/serving.py Co-authored-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com> * Quick fix * CPU -> Auto * Update src/transformers/commands/serving.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Fixup --------- Co-authored-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-07-23 17:03:18 +02:00
Pablo Montalvo	ea56eb6bed	Fix important models CI (#39576 ) * relax test boundaries and fix from config * eager is always supported.	2025-07-23 16:24:29 +02:00
Maxime Grenu	0fe03afeb8	Fix typos and grammar issues in documentation and code (#39598 ) - Fix Cyrillic 'Р' to Latin 'P' in Portuguese language link (README.md) - Fix 'meanginful' to 'meaningful' in training documentation - Fix duplicate 'Cohere' reference in modular transformers documentation - Fix duplicate 'the the' in trainer and chat command comments 🤖 Generated with [Claude Code](https://claude.ai/code) Co-authored-by: Claude <claude@anthropic.com> Co-authored-by: Claude <noreply@anthropic.com>	2025-07-23 12:43:11 +00:00
Matej Sirovatka	82603b6cc2	Allow `device_mesh` have multiple dim (#38949 ) * Feat: something * Feat: initial changes * tmp changes to unblock * Refactor * remove todo * Feat: docstring --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-07-23 12:27:36 +00:00
jiqing-feng	10c990f7e2	enable triton backend on awq xpu (#39443 ) * enable triton backend on awq xpu Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * Update src/transformers/quantizers/quantizer_awq.py Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com> * fix dtype check Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix check Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-07-23 12:10:38 +00:00
Raushan Turganbay	e7e6efcbbd	[idefics3] fix for vLLM (#39470 ) * fix idefics3 for vllm tests * fix copies	2025-07-23 14:00:43 +02:00
llbdyiu66	a62f65a989	fix moe routing_weights (#39581 ) * fix moe routing_weights * fix ernie4_5_moe routing_weights * fix integration test --------- Co-authored-by: llbdyiu66 <llbdyiu66@users.noreply.github.com> Co-authored-by: Vasqu <antonprogamer@gmail.com> Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>	2025-07-23 11:20:23 +00:00
Andrei Panferov	623ab01039	FP-Quant support (#38696 ) * quartet * quartet qat -> quartet * format * bf16 backward * interfaces * forward_method * quartet -> fp_quant * style * List -> list * list typing * fixed format and annotations * test_fp_quant * docstrings and default dtypes * better docstring and removed noop checks * docs * pseudoquantization support to test on non-blackwell * pseudoquant * Pseudoquant docs * Update docs/source/en/quantization/fp_quant.md Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update docs/source/en/quantization/fp_quant.md * Update docs/source/en/quantization/fp_quant.md * Update src/transformers/utils/quantization_config.py Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com> * Update tests/quantization/fp_quant_integration/test_fp_quant.py Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com> * Update tests/quantization/fp_quant_integration/test_fp_quant.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * small test fixes * dockerfile update * spec link * removed `_process_model_after_weight_loading` * toctree --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-07-23 11:41:10 +02:00
Raushan Turganbay	eb1a007f7f	Rename `supports_static_cache` to `can_compile_fullgraph` (#39505 ) * update all * Apply suggestions from code review Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * apply suggestions * fix copies --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-07-23 09:35:18 +00:00
Quentin Gallouédec	b357cbb19d	[Trackio] Allow single-gpu training and monitor power (#39595 ) Allow not distributed and monitor power	2025-07-23 11:22:50 +02:00
Cyril Vallez	019b74977d	Generic task-specific base classes (#39584 ) * first shot * Update modeling_layers.py * fix mro order * finalize llama * all modular and copied from from llama * fix	2025-07-23 10:49:47 +02:00
Cyril Vallez	5dba4bc7b2	Fix DynamicCache and simplify Cache classes a bit (#39590 ) * fix * use kwargs * simplify * Update cache_utils.py * Update cache_utils.py * Update test_cache_utils.py * fix * style	2025-07-23 10:13:45 +02:00
Sangbum Daniel Choi	d9b35c635e	Mask2former & Maskformer Fast Image Processor (#35685 ) * add maskformerfast * test * revert do_reduce_labels and add testing * make style & fix-copies * add mask2former and make fix-copies TO DO: add test for mask2former * make fix-copies * fill docstring * enable mask2former fast processor * python utils/custom_init_isort.py * make fix-copies * fix PR's comments * modular file update * add license * make style * modular file * make fix-copies * merge * temp commit * finish up maskformer mask2former * remove zero shot examples --------- Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co> Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>	2025-07-23 02:47:47 +00:00
Quentin Gallouédec	6e9972962f	🎯 Trackio integration (#38814 ) * First attempt * fix * fix * Enhance TrackioCallback to log GPU memory usage and allocation * Enhance Trackio integration in callbacks and training arguments documentation * re order * remove unused lines * fix torch optional	2025-07-22 14:50:20 -07:00
space_samurai	c6d0500d15	[WIP] Add OneformerFastImageProcessor (#38343 ) * [WIP] OneformerFastImageProcessor * update init * Fully working oneformer image processor fast * change Nearest to Neares exact interpolation where needed * fix doc --------- Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co> Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>	2025-07-22 20:41:39 +00:00
Harry Mellor	4884b6bf41	Fix link in "Inference server backends" doc (#39589 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-07-22 16:44:08 +00:00
Marc Sun	075a65657a	Torchdec RuntimeError catch (#39580 ) * fix * fix * maybe better * style	2025-07-22 18:35:03 +02:00
Kashif Rasul	2936902a76	[Paged-Attention] Handle continuous batching for repetition penalty (#39457 ) * Handle continuous batching for repetition penalty * fix last scores and with token mask creation * add test * Update src/transformers/generation/continuous_batching.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/generation/logits_process.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix formatting * remove unneeded cast --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-07-22 18:13:40 +02:00
Cássia Sampaio	cbcb8e6c1f	updated mistral3 model card (#39531 ) * updated mistral3 model card (#1) * updated mistral3 model card * applying suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * made all changes to mistral3.md * adding space between paragraphs in docs/source/en/model_doc/mistral3.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * removing duplicate in mistral3.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * adding 4 backticks to preserve formatting --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-07-22 09:01:55 -07:00
Woojun Jung	601260fd96	Update `docs/source/ko/_toctree.yml` (#39516 ) docs: update `docs/source/ko/_toctree.yml`	2025-07-22 09:00:42 -07:00
Manuel de Prada Corral	c338fd43b0	[cache refactor] Move all the caching logic to a per-layer approach (#39106 ) * Squash for refactor: Replace monolithic cache classes with modular LayeredCache (#38077) - Introduces CacheLayer and Cache base classes - Ports Static, Dynamic, Offloaded, Quantized, Hybrid, etc. to use layers - Implements method/attr dispatch across layers to reduce boilerplate - Adds CacheProcessor hooks for offloading, quantization, etc. - Updates and passes tests * fix quantized, add tests * remove CacheProcessorList * raushan review, arthur review * joao review: minor things * remove cache configs, make CacheLayer a mixin (joaos review) * back to storage inside Cache() * remove cachebase for decorator * no more __getattr__ * fix tests * joaos review except docs * fix ast deprecations for python 3.14: replace node.n by node.value and use `ast.Constant` More verbose exceptions in `fix_docstring` on docstring formatting issues. * Revert "back to storage inside Cache()" This reverts commit 27916bc2737806bf849ce2148cb1e66d59573913. * cyril review * simplify cache export * fix lfm2 cache * HybridChunked to layer * BC proxy object for cache.key_cache[i]=... * reorder classes * bfff come on LFM2 * better tests for hybrid and hybridChunked * complete coverage for hybrid chunked caches (prefill chunking) * reimplementing HybridChunked * cyril review * fix ci * docs for cache refactor * docs * oopsie * oopsie * fix after merge * cyril review * arthur review * opsie * fix lfm2 * opsie2	2025-07-22 16:10:25 +02:00
Cyril Vallez	b16688e96a	General weight initialization scheme (#39579 ) * general + modulars from llama * all modular models * style and fix musicgen * fix * Update configuration_musicgen.py * Update modeling_utils.py	2025-07-22 16:04:20 +02:00
Ákos Hadnagy	015b62bf3e	Add AMD GPU expectations for LLaVA tests (#39486 ) * Add AMD GPU expectation to llava tests * FMT * Remove debug print * Address review comments	2025-07-22 14:01:54 +00:00
Arthur	efceeaf267	Kernels flash attn (#39474 ) * use partial to wrap around `transformers` utils! * try to refactor? * revert one wrong change * just a nit * push * reverter watever was wrong! * some nits * fixes when there is no attention mask * bring the licence back * some fixes * nit * style * remove prints * correct dtype * fa flags for testing * update * use paged attention if requested! * updates * a clone was needed, not sure why * automatically create cu seq lens when input is flash, this at least makes sure layers don't re-compute * simplify and improve? * flash attention is kinda broken on recent cuda version so allow the opportunity to use something else * fix! * protect kernels import * update * properly parse generation config being passed * revert and update * add two tests * some fixes * fix test FA2 * takes comment into account * fixup * revert changes * revert the clone, it is only needed because the metal kernel is not doing it? * [docs] update attention implementation and cache docs (#39547) * update docs * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * applu suggestions --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * fix mps on our side for now * Update src/transformers/integrations/flash_paged.py * no qa --------- Co-authored-by: Vasqu <antonprogamer@gmail.com> Co-authored-by: Raushan Turganbay <raushan@huggingface.co> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-07-22 15:41:06 +02:00
Ákos Hadnagy	b62557e712	Add AMD expectations to Mistral3 tests (#39481 ) Add AMD expectations to mistral3 tests	2025-07-22 15:40:16 +02:00
Raushan Turganbay	1806583390	[docs] Create page on inference servers with transformers backend (#39550 ) * draft docs on inference servers * Update docs/source/en/_toctree.yml Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> * update * dic build failed * Update docs/source/en/transformers_as_backend.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/_toctree.yml Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/transformers_as_backend.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * apply last suggestions --------- Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-07-22 15:31:10 +02:00
Raushan Turganbay	cd98c1fee3	[docs] update attention implementation and cache docs (#39547 ) * update docs * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * applu suggestions --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-07-22 15:06:43 +02:00
Ákos Hadnagy	ef99537f37	Add AMD test expectations to DETR model (#39539 ) * Add AMD test expectations to DETR model * Fix baseline expectation * Address review comments * Make formatting a bit more consistent	2025-07-22 12:07:10 +00:00
Dominik Baran	30567c28e8	[timm_wrapper] add support for gradient checkpointing (#39287 ) * feat: add support for gradient checkpointing in TimmWrapperModel and TimmWrapperForImageClassification * ruff fix * refactor + add test for not supported model * ruff * Update src/transformers/models/timm_wrapper/modeling_timm_wrapper.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/timm_wrapper/modeling_timm_wrapper.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/timm_wrapper/modeling_timm_wrapper.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/timm_wrapper/modeling_timm_wrapper.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-07-22 11:07:52 +00:00
Wing Lian	a44dcbe513	Fixes needed for n-d parallelism and TP (#39562 ) Handle non-DTensors cases in TP Layers Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-07-22 10:24:59 +00:00
Ákos Hadnagy	0cae633ce1	Bump AMD container for 2.7.1 PyTorch (#39458 ) * Bump AMD container for 2.7.1 PyTorch * Forgot to update pinned packages	2025-07-22 12:11:38 +02:00
StevenBucaille	a88ea9cbc8	Add EfficientLoFTR model (#36355 ) * initial commit * Apply suggestions from code review Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * fix: various typos, typehints, refactors from suggestions * fix: fine_matching method * Added EfficientLoFTRModel and AutoModelForKeypointMatching class * fix: got rid of compilation breaking instructions * docs: added todo for plot * fix: used correct hub repo * docs: added comments * fix: run modular * doc: added PyTorch badge * fix: model repo typo in config * fix: make modular * fix: removed mask values from outputs * feat: added plot_keypoint_matching to EfficientLoFTRImageProcessor * feat: added SuperGlueForKeypointMatching to AutoModelForKeypointMatching list * fix: reformat * refactor: renamed aggregation_sizes config parameter into q, kv aggregation kernel size and stride * doc: added q, kv aggregation kernel size and stride doc to config * refactor: converted efficientloftr implementation from modular to copied from mechanism * tests: overwrote batching_equivalence for "keypoints" specific tests * fix: changed EfficientLoFTRConfig import in test_modeling_rope_utils * fix: make fix-copies * fix: make style * fix: update rope function to make meta tests pass * fix: rename plot_keypoint_matching to visualize_output for clarity * refactor: optimize image pair processing by removing redundant target size calculations * feat: add EfficientLoFTRImageProcessor to image processor mapping * refactor: removed logger and updated attention forward * refactor: added auto_docstring and can_return_tuple decorators * refactor: update type imports * refactor: update type hints from List/Dict to list/dict for consistency * refactor: update MODEL_MAPPING_NAMES and __all__ to include LightGlue and AutoModelForKeypointMatching * fix: change type hint for size parameter in EfficientLoFTRImageProcessor to Optional[dict] * fix typing * fix some typing issues * nit * a few more typehint fixes * Remove 
output_attentions and output_hidden_states from modeling code * else -> elif to support efficientloftr * nit * tests: added EfficientLoFTR image processor tests * refactor: reorder functions * chore: update copyright year in EfficientLoFTR test file * Use default rope * Add docs * Update visualization method * fix doc order * remove 2d rope test * Update src/transformers/models/efficientloftr/modeling_efficientloftr.py * fix docs * Update src/transformers/models/efficientloftr/image_processing_efficientloftr.py * update gradient * refactor: removed unused codepath * Add motivation to keep postprocessing in modeling code * refactor: removed unnecessary variable declarations * docs: use load_image from image_utils * refactor: moved stage in and out channels computation to configuration * refactor: set an intermediate_size parameter to be more explicit * refactor: removed all mentions of attention masks as they are not used * refactor: moved position_embeddings to be computed once in the model instead of every layer * refactor: removed unnecessary hidden expansion parameter from config * refactor: removed completely hidden expansions * refactor: removed position embeddings slice function * tests: fixed broken tests because of previous commit * fix is_grayscale typehint * not refactoring * not renaming * move h/w to embeddings class * Precompute embeddings in init * fix: replaced cuda device in convert script to accelerate device * fix: replaced stevenbucaille repo to zju-community * Remove accelerator.device from conversion script * refactor: moved parameter computation in configuration instead of figuring it out when instantiating a Module * fix: removed unused attributes in configuration * fix: missing self * fix: refactoring and tests * fix: make style --------- Co-authored-by: steven <steven.bucaille@buawei.com> Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-07-22 10:53:16 +01:00
Raushan Turganbay	3bc726b381	[gemma3] fix bidirectional image mask (#39396 ) * fix gemma3 mask * make compile happy, and use only torch ops * no full attention between images * update tests * fix tests * add a fast test	2025-07-22 10:04:56 +02:00
nlhm	fbeaf96f9e	Update OLMoE model card (#39344 ) * Update OLMoE model card * Checks Test * Add license and code * Update docs/source/en/model_doc/olmoe.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update olmoe.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-07-21 16:41:01 -07:00
Orion Weller	641aaed7c0	Update modernbertdecoder docs (#39453 ) * update docs with paper and real model * nit * Apply suggestions from code review Thanks to @stevhlui! Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Remove usage examples, add quantization --------- Co-authored-by: oweller2 <oweller2@dsailogin.mgmt.ai.cluster> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-07-21 16:40:22 -07:00
Anton Vlasjuk	049a674e68	[`CI`] Fix post merge ernie 4.5 (#39561 ) fix repo consistency	2025-07-21 20:56:24 +02:00
Yoni Gozlan	b3ebc761e2	[Fast image processors] Improve handling of image-like inputs other than images (segmentation_maps) (#39489 ) * improve handlike of other image-like inputs in fast image processors * fix issues with _prepare_images_structure * update sam image processor fast * use dict update	2025-07-21 14:12:14 -04:00
Anton Vlasjuk	b4115a426e	[`Ernie 4.5`] Add ernie text models (#39228 ) * init * copied from remote * add proper structure and llama like structure * fixup * revert to state that works * get closer to llama * slow and steady * some removal * masks work * it is indeed the rope implementation, how dafuq does it mesh with the cache now hmm * nice * getting closer * closer to transformers style * let's simplify this, batching works now * simplified * working version with modular * it is indeed the rotation per weights, make it complete llama style * cleanup conversion, next to look at -> tokenizer * remove llama artefacts * fix modeling tests (common ones) * style * integration test + first look into tokenization (will need more work, focussing on modeling other models first) * style * working moe version, based on remote * lets keep it simple and go step by step - transformers annotations for modular and transformers style rope (complex view) * more cleanup * refactor namings and remove addition forXXX classes * our moe won't cut it it seems, correction bias seems to be missing in remote code version * tokenization change (remote) * our moe version works when adding normalization :D * cleanup moe * nits * cleanup modeling -> let's get to modular next * style * modular v1 * minor things + attempt at conversion (which doesn't work) * no conversion follow glm, fixup modular and other nits * modular cleanup * fixes * tests, tests, tests + some moe dtype forcing * simplify modular, fix fatal fa2 bug, remaining tests * fix import issue?
* some initial docs, fix bnb faulty behavior --> needs to fix some tests because of gate needing to be float * fix sdpa test, load on init dtype only * fixup post merge * style * fix doc links * tokenization cleanup beginnings * simplify tokenizer by a lot as its basically llama * tokenizer is full llama with different defaults + extra special tokens * sync og special tokens of ernie * fix decoding with numbers (also in remote done what a timing), begin of tok tests * align with remote and preserve special tokens, adjust tests to ernie legacy behavior, warning for questionable behavior (also in llama) * nits * docs * my daily post merge it is * check * tokenization update with explanations and conversion script * review on modular (til), revert some tokenizer things i did prior, remove mtp comment (low prio) * post merge fixes * fixup tokenization, llama fast is the way to go * more fixups * check * import fixes * correction bias following the paddle code * fix * fix TP plan, fix correction bias sharding during forward * style * whoops * fix tied weights * docs and last nit * license * flasky tests * move repo id, update when merged on the hub v4.53.2-Ernie-4.5-preview	2025-07-21 19:51:49 +02:00
Pablo Montalvo	69b158260f	Refactor embedding input/output getter/setter (#39339 ) * simplify common get/set * remove some noise * change some 5 years old modeling utils * update examples * fix copies * revert some changes * fixes, gah * format * move to Mixin * remove smolvlm specific require grad * skip * force defaults * remodularise some stuff * remodularise more stuff * add safety for audio models * style * have a correct fallback, you daft donkey * remove this argh * change heuristic for audio models * fixup * revert * this works * revert again * 🧠 * aaah ESM has two modelings aaah * add informative but short comment * add `input_embed_layer` mixin attribute * style * walrus has low precedence * modular fix * this was breaking parser	2025-07-21 18:18:14 +02:00
김민서	2da97f0943	🌐 [i18n-KO] Translated `perf_infer_gpu_multi.md` to Korean (#39441 ) * docs: ko: perf_infer_gpu_many.md * feat: nmt draft * docs: refine KO translation and enhance naturalness * docs: add missing TOC to documentation * Align toctree and filename with original: perf_infer_gpu_multi Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com> * Refine Korean translation * Update docs/source/ko/perf_infer_gpu_multi.md Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com> * Update docs/source/ko/perf_infer_gpu_multi.md Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com> --------- Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com> Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com> Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com>	2025-07-21 09:14:15 -07:00
Yoni Gozlan	82807e56b1	[Fast image processor] refactor fast image processor glm4v (#39490 ) refactor fast image processor glm4v	2025-07-21 11:18:46 -04:00
Wing Lian	4b4f04fcca	fix ndim check of device_mesh for TP (#39538 )	2025-07-21 13:09:33 +00:00
Manuel de Prada Corral	1aa7256f01	Refactor `MambaCache` to `modeling_mamba.py` (#38086 ) * Refactor MambaCache to modeling_mamba.py (parity with Zamba) * ruff * fix dummies * update * update * remove mamba ref in cache tests * remove cache_implementation from tests * update * ruff * ruff * sneaky regression * model consistency * fix test_multi_gpu_data_parallel_forward * fix falcon slow tests * ruff * ruff * add sample false * try to fix slow tests * Revert "fix test_multi_gpu_data_parallel_forward" This reverts commit 66b7162c7c5c5ce8a73ccf48cffc8a96343ebb33. * fix tests on nvidia t4, remove dataparallel tests from mamba * ruff * remove DDP tests from mamba and falcon_mamba * add explicit error for MambaCache * mamba2 also needs to init cache in prepare_inputs_for_generation * ruff * ruff * move MambaCache to its own file * ruff * unprotected import fix * another attempt to fix unprotected imports * Revert "another attempt to fix unprotected imports" This reverts commit 2338354fcab630de5899321f5daced5fb312c2a2. * fixing unprotected import, attempt 3 * Update src/transformers/cache_utils.py * ruff's fault * fix arthur review * modular falcon mamba * found a hack * fix config docs * fix docs * add export info * merge modular falcon branch * oopsie * fix fast path failing * new approach * oopsie * fix types * Revert new pragma in modular This reverts commit 80b1cf160ee251536f07c40b8a0857d499e70db6. * trying another modular workaround * review & fix ci * oopsie * clear prepare_inputs on mamba/mamba2/falcon_mamba	2025-07-21 14:59:36 +02:00
st81	a419a40234	Fix Docstring of BarkProcessor (#39546 ) * Fix Docstring of BarkProcessor * Fix typo * Add type hint of return value for BarkProcessor.__call__	2025-07-21 12:56:44 +00:00
Wang, Yi	9323d0873c	use the enable_gqa param in torch.nn.functional.scaled_dot_product_at… (#39412 ) * use the enable_gqa param in torch.nn.functional.scaled_dot_product_attention Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * ci failure fix Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * add check Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * fix ci failure Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * refine code, extend to cuda Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * refine code Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * fix review comments Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * refine the PR Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> --------- Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>	2025-07-21 14:46:43 +02:00
BUI Van Tuan	6b3a1f2f51	Fix missing initializations for models created in 2023 (#39239 ) * fix SwiftFormer * fix Kosmos2 * fix Owlv2 * fix Sam * fix Vits * fix Pvt * fix MobileViTV2 * fix PatchTST * fix Bros * fix Informer * fix BridgeTower * fix Mra and Yoso * fix Rwkv * fix EfficientNet * fix NllbMoe * fix Tvp * fix Clap * fix Autoformer * fix SwiftFormer * fix Mgpstr * fix Align * fix VitMatte * fix SpeechT5 * add conditional check for parameters * fix SpeechT5 * fix TimmBackbone and Clvp * fix SwiftFormer * fix SeamlessM4T and SeamlessM4Tv2 * fix Align * fix Owlv2 and OwlViT * add reviewed changes * add reviewed changes * fix typo --------- Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>	2025-07-21 14:43:52 +02:00

19755 Commits