HuggingFace_transformer

Author	SHA1	Message	Date
Arthur	9c641dc161	v4.54.1 Some checks failed Secret Leaks / trufflehog (push) Has been cancelled Details	2025-07-29 17:32:03 +02:00
Arthur	3fd456b200	v4.54-release Some checks failed Release - Conda / build_and_package (push) Has been cancelled Details Secret Leaks / trufflehog (push) Has been cancelled Details	2025-07-25 20:44:40 +02:00
Lysandre Debut	f90de364c2	Rename huggingface_cli to hf (#39630 ) * Rename huggingface_cli to hf * hfh	2025-07-25 14:10:04 +02:00
Quentin Lhoest	91f591f7bc	Make pytorch examples UV-compatible (#39635 ) * update release.py * add uv headers in some pytorch examples * rest of pytorch examples * style	2025-07-25 10:46:22 +02:00
Raushan Turganbay	eb1a007f7f	Rename `supports_static_cache` to `can_compile_fullgraph` (#39505 ) * update all * Apply suggestions from code review Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * apply suggestions * fix copies --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-07-23 09:35:18 +00:00
Pablo Montalvo	69b158260f	Refactor embedding input/output getter/setter (#39339 ) * simplify common get/set * remove some noise * change some 5 years old modeling utils * update examples * fix copies * revert some changes * fixes, gah * format * move to Mixin * remove smolvlm specific require grad * skip * force defaults * remodularise some stuff * remodularise more stuff * add safety for audio models * style * have a correct fallback, you daft donkey * remove this argh * change heuristic for audio models * fixup * revert * this works * revert again * 🧠 * aaah ESM has two modelings aaah * add informative but short comment * add `input_embed_layer` mixin attribute * style * walrus has low precedence * modular fix * this was breaking parser	2025-07-21 18:18:14 +02:00
Sai-Suraj-27	970d9a75ce	Raise `TypeError` instead of ValueError for invalid types (#38660 ) * Raise TypeError instead of ValueError for invalid types. * Removed un-necessary changes. * Resolved conflicts * Code quality * Fix failing tests. * Fix failing tests.	2025-07-21 12:42:00 +00:00
Raushan Turganbay	8c102e2eb1	Rename `_supports_flash_attn_2` in examples and tests (#39471 ) * delete `_supports_flash_attn_2` from examples and tests * simplify docs	2025-07-21 14:02:57 +02:00
Yuanyuan Chen	60b5471da3	Enable some ruff checks for performance and readability (#39383 ) * Fix inefficient sequence tests Signed-off-by: cyy <cyyever@outlook.com> * Enable PERF102 Signed-off-by: cyy <cyyever@outlook.com> * Enable PLC1802 Signed-off-by: cyy <cyyever@outlook.com> * Enable PLC0208 Signed-off-by: cyy <cyyever@outlook.com> --------- Signed-off-by: cyy <cyyever@outlook.com>	2025-07-17 13:21:59 +00:00
Hosein Rezaei	d8e05951b8	Fix bugs in pytorch example run_clm when streaming is enabled (#39286 )	2025-07-15 15:37:28 +02:00
eromomon	903944a411	[examples] fix do_reduce_labels argument for run_semantic_segmentation_no_trainer (#39322 ) * no use do_reduce_labels argument in model * use do_reducer_labels in AutoImageProcessor	2025-07-14 10:16:49 +00:00
eromomon	af74ec65a7	Update Readme to Run Multiple Choice Script from Example Directory (#39323 ) * Update Readme to run in current place * Update Readme files to execute PyTorch examples from their respective folders	2025-07-11 10:58:26 -07:00
Cyril Vallez	1cefb5d788	[modular] Allow method with the same name in case of @property decorator (#39308 ) * fix * add example * fix * Update modular_model_converter.py	2025-07-09 15:46:53 +02:00
Quentin Lhoest	1ecd52e50a	Add torchcodec in docstrings/tests for `datasets` 4.0 (#39156 ) * fix dataset run_object_detection * bump version * keep same dataset actually * torchcodec in docstrings and testing utils * torchcodec in dockerfiles and requirements * remove duplicate * add torchocodec to all the remaining docker files * fix tests * support torchcodec in audio classification and ASR * [commit to revert] build ci-dev images * [commit to revert] trigger circleci * [commit to revert] build ci-dev images * fix * fix modeling_hubert * backward compatible run_object_detection * revert ci trigger commits * fix mono conversion and support torch tensor as input * revert map_to_array docs + fix it * revert mono * nit in docstring * style * fix modular --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-07-08 17:06:12 +02:00
Cyril Vallez	056fa73fae	[modular] Simplify logic and docstring handling (#39185 ) * simplify a lot * Update modular_model_converter.py * finalize * remove outdated functions * apply it * and examples	2025-07-07 14:52:57 +02:00
Cyril Vallez	5348fbc005	[modular] Follow global indexing and attribute setting, and their dependencies (#39180 ) * export global indexing statements * add example * style * examples	2025-07-07 14:36:43 +02:00
Arthur	ca7e1a3756	Refactor the way we handle outputs for new llamas and new models (#39120 ) * just update 2 files * update other models as well just making fix-copies * also add the changes needed to modeling utils * put this on the pretrained model instead * nits and fixes * update generic, fix to use config value * update other modelings * use transformers kwargs instead * update * update * update other models * update * updates * update * update * update * fix * finally * very small nits * this fixes more tests * fix other models as well! * update modularqwen2 * update models based on qwen2 * update * update * remove the *flash stuff in favor of noraml kwargs update * propagate gemma? * remove output attentions * propagate * support cross attention edge case * same * test this * fixes * more fix * update * update * fix conflicts * update * fix emu3 * fix emu3 * move the fix a bit * quel enfer * some fixes, loss_kwargs should never had been * finish fixing gemma3n * fix small lm3 * fix another one * fix csm now * fux csm and mistral * fix mistral now * small fixes * fix janusss * only for some models * fixup * phix phi3 * more fixes? * dose this fix it? * update * holy shit it was just graph breaks * protect torch * updates * fix samhq? * fix moonshine * more moonshine fixes, 3 failures left! * nits * generic needs to support more * more fixes to moonshine! * fix cross attention outputs! * fix csm! * nits * fix stupid kosmos2 * current updates * fixes * use output recorder? * nicer! * a little bit of magic * update * fix protect * fix * small fixes * protect import * fix a bunch of more models * fix fixups * fix some of the last ones * nit * partly fix phi * update * fix import path * make something that is fullgraph compatible just to be sure * typing was wrong on llama so the rest was wrong as well * fucking ugly but at least it is still exportable * syle * supposed to fix moonshine, it still breaks * fix some default * fix the last bits of sam * update samhq * more fixes to am hq * nit * fix all output+hidden states and output_attentions! * fix? * fix diffllama * updates to fix initialization on the sam pips * ups there was a bug * fix the last sam hq test * fix gotocr * fix gotocr2! * fixes * skip stupid tests * there was one left :) * fixup * fix fix copies issues with this test file * fix copies for sam_hq * rm some comments * skip 2 more failing tests * fix * fix everything * Apply suggestions from code review Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com> Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * add more doc! * fix public init * fix modular qwen3 --------- Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com> Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>	2025-07-05 11:34:28 +02:00
Lysandre	5154497607	Dev version	2025-06-26 18:04:36 +02:00
Quentin Lhoest	858f9b71a8	Remove script datasets in tests (#38940 ) * remove trust_remote_code * again * Revert "Skip some tests for now (#38931)" This reverts commit `31d30b7224`. * again * style * again * again * style * fix integration test * fix tests * style * fix * fix * fix the last ones * style * last one * fix last * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-25 14:31:20 +00:00
Cyril Vallez	e1e11b0299	Fix undeterministic order in modular dependencies (#39005 ) * sort correctly * Update modeling_minimax.py * Update modular_model_converter.py	2025-06-24 17:04:33 +02:00
Mylon Jones	9f42c1f192	Added scikit-learn to the example image-classification requirements.txt (#37506 ) Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-06-24 15:24:02 +02:00
Dianana	4f650040a6	Removing extra space in large command for speech-pretraining example (#38705 ) Removing extra space in Large command	2025-06-24 12:24:56 +00:00
Yih-Dar	31d30b7224	Skip some tests for now (#38931 ) * try * [test all] --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-20 11:05:49 +02:00
Matt	508a704055	No more Tuple, List, Dict (#38797 ) * No more Tuple, List, Dict * make fixup * More style fixes * Docstring fixes with regex replacement * Trigger tests * Redo fixes after rebase * Fix copies * [test all] * update * [test all] * update * [test all] * make style after rebase * Patch the hf_argparser test * Patch the hf_argparser test * style fixes * style fixes * style fixes * Fix docstrings in Cohere test * [test all] --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-17 19:37:18 +01:00
Quentin Gallouédec	de24fb63ed	Use HF papers (#38184 ) * Use hf papers * Hugging Face papers * doi to hf papers * style	2025-06-13 11:07:09 +00:00
Cyril Vallez	4b8ec667e9	Remove all traces of `low_cpu_mem_usage` (#38792 ) * remove it from all py files * remove it from the doc * remove it from examples * style * remove traces of _fast_init * Update test_peft_integration.py * CIs	2025-06-12 16:39:33 +02:00
Francisco R Castro Garcia	cb4c56ce0d	Fix typo in Language Modeling example scripts and update TPU type (#38652 ) * Fix typo that prevents the examples to be run correctly * return .TPU in accelerator.distributedtype comparison	2025-06-10 13:43:35 +00:00
Anthony	19224c3642	fix: "check out" as verb (#38678 ) "check out" as verb	2025-06-09 14:07:31 +00:00
dependabot[bot]	65f5fa71cd	Bump torch from 2.6.0 to 2.7.1 in /examples/flax/vision (#38606 ) Bumps [torch](https://github.com/pytorch/pytorch) from 2.6.0 to 2.7.1. - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md) - [Commits](https://github.com/pytorch/pytorch/compare/v2.6.0...v2.7.1) --- updated-dependencies: - dependency-name: torch dependency-version: 2.7.1 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-06-05 13:38:02 +01:00
Sai-Suraj-27	a510be20f3	Updated deprecated typing imports with equivalents for Python 3.9+ (#38546 ) * Replace deprecated typing imports with collections.abc equivalents for Python 3.9+ * Fixed code quality --------- Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-06-04 16:57:23 +00:00
Raushan Turganbay	bf68dd9e6e	[tests] expand flex-attn test for vision models (#38434 ) * expand the test for VLMs * typo * mark models `supports_flex` + expand test for additional kwargs * flex attn for refactored vision models * fix copies * fix * unskip * style * address comments	2025-06-03 07:40:44 +00:00
dependabot[bot]	f962c862ff	Bump torch from 2.2.0 to 2.6.0 in /examples/flax/vision (#37618 ) Bumps [torch](https://github.com/pytorch/pytorch) from 2.2.0 to 2.6.0. - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md) - [Commits](https://github.com/pytorch/pytorch/compare/v2.2.0...v2.6.0) --- updated-dependencies: - dependency-name: torch dependency-version: 2.6.0 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-05-30 14:04:52 +01:00
Arthur	211f2b0875	Add CB (#38085 ) * stash for now * initial commit * small updated * up * up * works! * nits and fixes * don't loop too much * finish working example * update * fix the small freeblocks issue * feat: stream inputs to continuous batch * fix: update attn from `eager` to `sdpa` * refactor: fmt * refactor: cleanup unnecessary code * feat: add `update` fn to `PagedAttentionCache` * feat: broken optimal block size computation * fix: debugging invalid cache logic * fix: attention mask * refactor: use custom prompts for example * feat: add streaming output * fix: prefill split refactor: add doc strings and unsound/redundant logic fix: compute optimal blocks logic * fix: send decoded tokens when `prefilling_split` -> `decoding` * refactor: move logic to appropriate parent class * fix: remove truncation as we split prefilling anyways refactor: early return when we have enough selected requests * feat: add paged attention forward * push Ggraoh> * add paged sdpa * update * btter mps defaults * feat: add progress bar for `generate_batch` * feat: add opentelemetry metrics (ttft + batch fill %age) * feat: add tracing * Add cuda graphs (#38059) * draft cudagraphs addition * nits * styling * update * fix * kinda draft of what it should look like * fixes * lol * not sure why inf everywhere * can generate but output is shit * some fixes * we should have a single device synch * broken outputs but it does run * refactor * updates * updates with some fixes * fix mask causality * another commit that casts after * add error * simplify example * update * updates * revert llama changes * fix merge conflicts * fix: tracing and metrics * my updates * update script default values * fix block allocation issue * fix prefill split attnetion mask * no bugs * add paged eager * fix * update * style * feat: add pytorch traces * fix * fix * refactor: remove pytorch profiler data * style * nits * cleanup * draft test file * fix * fix * fix paged and graphs * small renamings * cleanups and push * refactor: move tracing and metrics logic to utils * refactor: trace more blocks of code * nits * nits * update * to profile or not to profile * refactor: create new output object * causal by default * cleanup but generations are still off for IDK what reason * simplifications but not running still * this does work. * small quality of life updates * nits * updaet * fix the scheduler * fix warning * ol * fully fixed * nits * different generation parameters * nice * just style * feat: add cache memory usage * feat: add kv cache free memory * feat: add active/waiting count & req latency * do the sampling * fix: synchronize CUDA only if available and improve error handling in ContinuousBatchingManager * fix on mps * feat: add dashboard & histogram buckets * perf: improve waiting reqs data structures * attempt to compile, but we should only do it on mps AFAIK * feat: decouple scheduling logic * just a draft * c;eanup and fixup * optional * style * update * update * remove the draft documentation * fix import as well * update * fix the test * style doomed --------- Co-authored-by: Luc Georges <luc.sydney.georges@gmail.com>	2025-05-22 17:43:48 +02:00
Arthur	e288ee00d8	tp plan should not be NONE (#38255 ) * accept custom device_mesh * fix device_map * assert that num_heads % tp_size == 0 * todo. * ReplicateParallel * handle tied weights * handle dtensor in save_pretrained with safe_serialization * tp test works * doesnt work * fix shard_and_distribute_module's rank should be local_rank * tp=4 is correct * dp+tp is broken * todo allreduce with dtensors on another dim is annoying * workaround to sync dp grads when using dtensors * loading a checkpoint works * wandb and compare losses with different tp/dp * cleaning * cleaning * . * . * logs * CP2 DP2 no mask works after commenting attn_mask and is_causal from scaled_dot_product_attention * DP=2 TP=2 now works even with tied embeddings * model.parameters() and model.module.parameters() are empty.. * reformat sanity_check_tensor_sync * set atol=1e-4 for CP to pass * try populate _parameters from named_modules * refactors TP2 DP2 works CP2 DP2 works * is_causal=True and pack sequences, no attn mask, and preshuffle dataset * fix packing * CP=4 doesn't work * fix labels and position_ids for CP * DP CP works with transformers 🥳🥳🥳 * refactor * add example cp * fixup * revert sdpa changes * example cleared * add CP, DP to the mesh init * nit * clean * use `ALL_PARALLEL_STYLES` * style * FSDP works * log on 1 rank * . * fix? * FSDP1 also has .parameters() bug * reported gradnorm when using FSDP1 is wrong, but loss is correct so it's okay * . * style and fixup * move stuff around * fix tests * style * let's make it a check * add missing licences * warning should be an info * tp plan should not be NONE * test all * god damn it * test all --------- Co-authored-by: nouamanetazi <nouamane98@gmail.com>	2025-05-21 10:22:38 +02:00
Lysandre Debut	711d78d104	Revert parallelism temporarily (#38240 ) * Revert "Protect ParallelInterface" This reverts commit `cb513e35f9`. * Revert "parallelism goes brrr (#37877)" This reverts commit `1c2f36b480`. * Empty commit	2025-05-20 22:43:04 +02:00
Lysandre	f4ef41c45e	v4.53.0.dev0	2025-05-20 18:12:56 +02:00
Nouamane Tazi	1c2f36b480	parallelism goes brrr (#37877 ) * accept custom device_mesh * fix device_map * assert that num_heads % tp_size == 0 * todo. * ReplicateParallel * handle tied weights * handle dtensor in save_pretrained with safe_serialization * tp test works * doesnt work * fix shard_and_distribute_module's rank should be local_rank * tp=4 is correct * dp+tp is broken * todo allreduce with dtensors on another dim is annoying * workaround to sync dp grads when using dtensors * loading a checkpoint works * wandb and compare losses with different tp/dp * cleaning * cleaning * . * . * logs * CP2 DP2 no mask works after commenting attn_mask and is_causal from scaled_dot_product_attention * DP=2 TP=2 now works even with tied embeddings * model.parameters() and model.module.parameters() are empty.. * reformat sanity_check_tensor_sync * set atol=1e-4 for CP to pass * try populate _parameters from named_modules * refactors TP2 DP2 works CP2 DP2 works * is_causal=True and pack sequences, no attn mask, and preshuffle dataset * fix packing * CP=4 doesn't work * fix labels and position_ids for CP * DP CP works with transformers 🥳🥳🥳 * refactor * add example cp * fixup * revert sdpa changes * example cleared * add CP, DP to the mesh init * nit * clean * use `ALL_PARALLEL_STYLES` * style * FSDP works * log on 1 rank * . * fix? * FSDP1 also has .parameters() bug * reported gradnorm when using FSDP1 is wrong, but loss is correct so it's okay * . * style and fixup * move stuff around * fix tests * style * let's make it a check * warning should be an info --------- Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>	2025-05-20 16:22:52 +02:00
Yong Hoon Shin	555715f418	Fix broken example generation script for Llama3 (#38062 ) Fix broken example generation script for llama3	2025-05-20 10:53:43 +02:00
Fanli Lin	8fb60bf6be	add timeout for downloading the `librispeech_asr` dataset (#38073 ) * add timeout * change 10 to 60	2025-05-13 11:50:12 +01:00
co63oc	5b573bebb9	Fix typos in strings and comments (#37910 )	2025-05-01 14:58:58 +01:00
Lysandre Debut	d538293f62	Transformers cli clean command (#37657 ) * transformers-cli -> transformers * Chat command works with positional argument * update doc references to transformers-cli * doc headers * deepspeed --------- Co-authored-by: Joao Gante <joao@huggingface.co>	2025-04-30 12:15:43 +01:00
Cyril Vallez	4602059aae	[modular] Fix the prefix-based renaming if the old and new model share a common name suffix (#37829 ) * first try * Fix and set examples * style * fix * Update modular_test_detr.py * Update image_processing_new_imgproc_model.py * Update modular_model_converter.py	2025-04-29 10:43:23 +02:00
Yuanyuan Chen	da4ff2a5f5	Add Optional to remaining types (#37808 ) More Optional typing Signed-off-by: cyy <cyyever@outlook.com>	2025-04-28 14:20:45 +01:00
Matt	9ec8be56dd	TransfoXL is deprecated, don't keep it in tested examples! (#37707 ) * TransfoXL is deprecated, so we should remove it from examples that get tested * Remove the tokenizer too * Trigger tests	2025-04-23 14:59:38 +01:00
Ken J	ca4c114dc4	Add counters for dataset classes (#37636 ) * add counters for dataset classes * fix failed code style	2025-04-22 17:30:43 +01:00
jeffhataws	964a1b6b7d	Fix ValueError when eval_do_concat_batches=False with examples (#37621 ) https://github.com/huggingface/transformers/issues/37593 Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-04-22 12:13:25 +02:00
we1559	b0c6ff5e13	fix issue that some example with no trainer use accelerator.end_train… (#37435 ) * fix issue that some example with no trainer use accelerator.end_training in a wrong way * reformat code --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-04-18 17:59:42 +02:00
Cyril Vallez	8ab296501a	Remove deprecation warning for `num_logits_to_keep` (#37149 ) * remove everything * style	2025-04-14 19:08:45 +02:00
cyyever	0fb8d49e88	Use Python 3.9 syntax in examples (#37279 ) Signed-off-by: cyy <cyyever@outlook.com>	2025-04-07 12:52:21 +01:00
Lysandre	d1b92369ca	v4.52.0.dev0	2025-04-05 22:04:21 +02:00

1 2 3 4 5 ...

2632 Commits