HuggingFace_transformer

Author	SHA1	Message	Date
Driss Guessous	e8d960329e	Add option for ao base configs (#36526 )	2025-03-19 14:59:47 +01:00
Arthur	fef8b7f8e9	Add attention visualization tool (#36630 ) * add utils fiel * style * nits * nits * update * updaets * update * fix init issues * big updates * nits * nits? * small updates * nites * there were still some models left * style * fixes * updates * nits _ fixes * push changes * update * update * update * Apply suggestions from code review Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * style * styling and return a string for testing * small updates * always biderectional for now * update --------- Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>	2025-03-19 13:58:46 +01:00
Joao Gante	0fe0bae0a8	[Generation] remove leftover code from end-to-end compilation (#36685 )	2025-03-19 11:28:33 +00:00
Mohamed Mekkouri	a861db01e5	Fix Device map for bitsandbytes tests (#36800 ) fix	2025-03-19 11:57:13 +01:00
Yih-Dar	b9374a0763	Remove `dist": "loadfile"` for `pytest` in CircleCI jobs (#36811 ) * fasterrrrr * avoid crash in example jobs * avoid crash in TF example jobs --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-19 11:15:09 +01:00
Yao Matrix	4fa91b1be5	fix "Cannot copy out of meta tensor; no data!" issue for BartForConditionalGeneration model (#36572 ) * fix "Cannot copy out of meta tensor; no data!" issue for BartForConditionalGeneration model * follow Marc's suggestion to use _tie_weights to fix Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * fix review comments. Signed-off-by: N <matrix.yao@intel.com> * fix quality Signed-off-by: N <matrix.yao@intel.com> --------- Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Signed-off-by: N <matrix.yao@intel.com>	2025-03-19 10:48:47 +01:00
ivarflakstad	706703bba6	Expectations test utils (#36569 ) * Add expectation classes + tests * Use typing Union instead of \| * Use bits to track score in properties cmp method * Add exceptions and tests + comments * Remove compute cap minor as it is not needed currently * Simplify. Remove Properties class * Add example Exceptions usage * Expectations as dict subclass * Update example Exceptions usage * Refactor. Improve type name. Document score fn. * Rename to DeviceProperties.	2025-03-18 23:39:50 +01:00
Joao Gante	179d02ffb8	[generate] ✨ vectorized beam search ✨ (#35802 )	2025-03-18 18:39:36 +00:00
Yoni Gozlan	12f2ebef63	Support custom dosctrings in modular (#36726 ) * Override docstrings in modular if not none * Update doc	2025-03-18 14:00:54 -04:00
Gar	00915d3041	Fix chameleon's TypeError because inputs_embeds may None (#36673 ) * fix chameleon TypeError when inputs_embeds is None * reformat * hotfix	2025-03-18 18:59:30 +01:00
Marc Sun	14b597f518	Fix casting dtype for qunatization (#36799 ) * fix * remove print	2025-03-18 18:46:03 +01:00
Yoni Gozlan	30580f035b	Fix Mistral3 tests (#36797 ) * fix processor tests * fix modeling tests * fix test processor chat template * revert modeling test changes	2025-03-18 13:08:12 -04:00
Cyril Vallez	db1d4c5a0b	Loading optimizations (#36742 ) * improvements * Update modeling_utils.py * add some doc about loading * Update modeling_utils.py	2025-03-18 16:38:44 +01:00
Yih-Dar	7baf00089a	Update SHA for `tj-actions/changed-files` (#36795 ) * trigger * trigger --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-18 16:19:39 +01:00
Marc Sun	3017536ebf	fix hqq due to recent modeling changes (#36771 ) * fix-hqq * style * test	2025-03-18 12:20:27 +01:00
Cyril Vallez	e959530b8f	Add Mistral3 (#36790 ) * initial start * style and dummies * Create convert_mistral3_weights_to_hf.py * update * typo * typo * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * up * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * update * update * Update image_processing_mistral3.py * Update convert_mistral3_weights_to_hf.py * fix patch merger * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * up * update modular to fit * style * Update convert_mistral3_weights_to_hf.py * typo * Update modular_mistral3.py * simplify a lot all shape shenanigans * simplify * add working test processor * Add partially working common modeling tests * All tests working and remove mistral3 image processors * add docs and fixup * fix inference with image size >1540 * 🚨fix test image proc pixtral * Remove vision_feature_select_strategy * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * Update convert_mistral3_weights_to_hf.py * clean * fix test checkpoints * Update test_modeling_mistral3.py * Update test_modeling_mistral3.py * style * Use Pixtral processor * up * finish cleaning processor to use pixtral directly * Update __init__.py * Update processing_pixtral.py * doc * Update __init__.py * Update mistral3.md * Update _toctree.yml --------- Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co> Co-authored-by: yonigozlan <yoni.gozlan10@gmail.com> v4.49.0-Mistral-3	2025-03-18 12:04:42 +01:00
Lysandre Debut	bd92073692	Fix gemma3_text tokenizer in mapping (#36793 )	2025-03-18 11:50:22 +01:00
Zebin	7426d02ea8	Fixing typo in gemma3 image_processor_fast and adding a small test (#36776 ) Co-authored-by: zebz13 <zeb@fedora> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-03-18 11:35:06 +01:00
Afanti	19b9d8ae13	chore: fix typos in tests directory (#36785 ) * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory * chore: fix typos in tests directory	2025-03-18 10:31:13 +01:00
Afanti	7f5077e536	fix typos in the tests directory (#36717 )	2025-03-17 17:45:57 +00:00
Daniel Kleine	cbfb8d7b27	doc: Clarify `is_decoder` usage in PretrainedConfig documentation (#36724 ) * fix: clarify decoder usage in PretrainedConfig documentation * Apply suggestions from code review updated doc Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-03-17 09:40:25 -07:00
Steven Liu	ac1a1b66b9	[docs] Update README (#36265 ) * update * feedback * feedback * update versions	2025-03-17 09:37:19 -07:00
Joao Gante	cff4caa0c1	[CI] remove redundant checks in `test_eager_matches_sdpa_inference` (#36740 )	2025-03-17 16:29:18 +00:00
Christopher Akiki	e3af4fec91	[MINOR:TYPO] Update hubert.md (#36733 ) * [MINOR:TYPO] Update hubert.md - typo fix (wave2vec instead of hubert) - make code snippet copiable and runnable * Run tests	2025-03-17 09:07:51 -07:00
Petr Kuderov	c8a2b25f91	Fix `TrainingArguments.torch_empty_cache_steps` post_init check (#36734 ) Mistaken use of De Morgan's law. Fixed "not (X or Y)" to correct "not (X and Y)" check to raise a ValueError. Added corresponding test to check "positive int or None" condition. Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-03-17 16:09:46 +01:00
Sambhav Dixit	8e67230860	Fix test isolation for clear_import_cache utility (#36345 ) * test fixup * test fixup * fixing tests for unused imports * style fixes * fix * style fixes * styke fix * remove isolated module cache * rm custom subprocess defination * run using exsiting fn * style fixup * make fixup * remove redundant comments * rm redundat skipif + style changes	2025-03-17 16:09:09 +01:00
jiqing-feng	27361bd218	fix xpu tests (#36656 ) * fix awq xpu tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * update Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix llava next video bnb tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-03-17 15:57:49 +01:00
Fredrik Norén	da7d64f4ff	Allow ray datasets to be used with trainer (#36699 ) Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-03-17 15:44:47 +01:00
jiqing-feng	2256875a77	fix can_generate (#36570 ) * fix can_generate Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix can generate for speecht5 and blip Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix speecht5 tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>	2025-03-17 14:56:18 +01:00
Marc Sun	9e94801146	enable/disable compile for quants methods (#36519 ) * disable compile for most quants methods * fix * Update src/transformers/generation/configuration_utils.py Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com> * Update tests/quantization/bnb/test_mixed_int8.py Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com> * Update src/transformers/generation/configuration_utils.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * changes from joao suggestions --------- Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-03-17 11:38:21 +01:00
Armaghan Shakir	c53d53da89	🚨🚨🚨 Fix sdpa in SAM and refactor relative position embeddings (#36422 ) * fall back to eager if output_attentions * improve relative position embeddings * run modular on got_ocr2 * run-slow: sam * fix run-length encoding * fix tf processor errors * update tf_sam * fix compile error * re-run tests	2025-03-17 09:39:52 +00:00
Joao Gante	fc8764c9a6	[Generation, Gemma 3] When passing a custom `generation_config`, overwrite default values with the model's base `generation_config` (#36684 )	2025-03-15 12:40:09 +00:00
Guillaume LEGENDRE	f263e88dcf	Update self-push-caller.yml	2025-03-15 11:32:04 +01:00
Ilyas Moutawwakil	6f3e0b68e0	Fix grad accum arbitrary value (#36691 )	2025-03-14 22:03:01 +01:00
Cyril Vallez	2c2495cc7b	Fix post_init() code duplication (#36727 ) * Update modeling_utils.py * CIs	2025-03-14 17:36:02 +01:00
MaCAT	25992b493c	🌐 [i18n-KO] Translated codegen.md to Korean (#36698 ) * Initial translation * Add _toctree.yml	2025-03-14 09:31:18 -07:00
Joao Gante	42ebb6c23e	[tests] Parameterized `test_eager_matches_sdpa_inference` (#36650 )	2025-03-14 14:41:27 +00:00
Matt	9215cc62d4	Try working around the processor registration bugs (#36184 ) * Try working around the processor registration bugs * oops * Update error message * Clarify error * Docstring docstring docstring * The extra content is indexed by config class, so let's grab some values out of there * Commit my confusion as a TODO * Resolve my confusion * Cleanup and mostly revert to the original * Better autoclass fallback * Don't nest f-strings you lunatic * Clearer error message * Less getattr() * Revert a lot of changes to try a different approach! * Try the global registry * Check the dynamic list as well as the transformers root * Move the dynamic list somewhere safer * Move the dynamic list somewhere even safer * More import cleanup * Simplify all the register_for_auto_class methods * Set _auto_class in the register() methods * Stop setting the cls attribute in register() * Restore specifying the model class for Model derivatives only * Fix accidentally taking the .__class__ of a class * Revert register_for_auto_class changes * Fix get_possibly_dynamic_module * No more ALL_CUSTOM_CLASSES * Fix up get_possibly_dynamic_module as well * Revert unnecessary formatting changes * Trigger tests	2025-03-14 13:56:21 +00:00
Sean (Seok-Won) Yi	691d1b52c3	Fix/best model checkpoint fix (#35885 ) * Set best_model_checkpoint only when ckpt exists. Rather than set it explicitly without checking if the checkpoint directory even exists as before, now we moved the setting logic inside of _save_checkpoint and are only setting it if it exists. * Added best_global_step to TrainerState. * Added tests for best_model_checkpoint. * Fixed hard-coded values in test to prevent fail. * Added helper func and removed hard-coded best_step. * Added side effect patch generator for _eval. * Added evaluate side effect func. * Removed erroneous patching. * Fixed minor bug. * Applied Ruff. * Fixed Ruff problem in make style. * Used Trainer.set_initial_training_values.	2025-03-14 14:24:53 +01:00
Joao Gante	3bd1a0ddf1	[model loading] don't `gc.collect()` if only 1 shard is used (#36721 ) * don't gc collect if 1 shard is used * delete state dict anyways	2025-03-14 12:56:56 +00:00
Matt	8cb522b419	Cleanup the regex used for doc preprocessing (#36648 ) * Cleanup the regex used for doc preprocessing * Run tests	2025-03-14 12:18:49 +00:00
Matt	72861e11eb	Make the flaky list a little more general (#36704 ) * Make the flaky list a little more general * Trigger tests * Make the flaky list a little more general	2025-03-14 12:15:32 +00:00
Kingsley	53742b11f5	Gemma3 processor typo (#36710 ) * fix typo when is on * tiny * add test and remove 'text_crops' * lint	2025-03-14 13:07:55 +01:00
Yoni Gozlan	69bc848480	Add support for fast image processors in add-new-model-like CLI (#36313 ) * add support for fast image processors in add-new-model-like * fix header not found add-fast-image-processor-cli * Encourage adding fast image processor * nit * start improve doc * update docs * make requested modifs	2025-03-13 14:16:37 -04:00
Matt	48ef468c74	Final CI cleanup (#36703 ) * make fixup * make fixup * Correct skip decorator * Add TODOs * add is_flaky() parentheses	2025-03-13 17:26:09 +00:00
Isotr0py	b070025aa6	Add GGUF support to T5-Encoder (#36700 ) * add gguf support to t5encoder Signed-off-by: Isotr0py <2037008807@qq.com> * fix Signed-off-by: Isotr0py <2037008807@qq.com> * remove gguf from model_kwargs Signed-off-by: Isotr0py <2037008807@qq.com> --------- Signed-off-by: Isotr0py <2037008807@qq.com>	2025-03-13 17:57:33 +01:00
Mohamed Mekkouri	4a60bae8e2	Handling an exception related to HQQ quantization in modeling (#36702 ) * adding exception * style * add types	2025-03-13 17:53:36 +01:00
Mehant Kammakomati	09a309d273	fix: fsdp sharded state dict wont work for save_only_model knob (#36627 ) Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-03-13 17:17:35 +01:00
Cyril Vallez	2a004f9ff1	Add loading speed test (#36671 ) * Update test_modeling_utils.py * Update test_modeling_utils.py * Update test_modeling_utils.py * Update test_modeling_utils.py * Update test_modeling_utils.py * Update test_modeling_utils.py * trigger CIs * Update test_modeling_utils.py * Update test_modeling_utils.py * Update test_modeling_utils.py * better error messages * Update test_modeling_utils.py * Update test_modeling_utils.py	2025-03-13 17:07:30 +01:00
Joao Gante	a3201cea14	[CI] Automatic rerun of certain test failures (#36694 )	2025-03-13 15:40:23 +00:00

18280 Commits