Driss Guessous
e8d960329e
Add option for ao base configs ( #36526 )
2025-03-19 14:59:47 +01:00
Arthur
fef8b7f8e9
Add attention visualization tool ( #36630 )
...
* add utils file
* style
* nits
* nits
* update
* updates
* update
* fix init issues
* big updates
* nits
* nits?
* small updates
* nits
* there were still some models left
* style
* fixes
* updates
* nits _ fixes
* push changes
* update
* update
* update
* Apply suggestions from code review
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com >
* style
* styling and return a string for testing
* small updates
* always bidirectional for now
* update
---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com >
2025-03-19 13:58:46 +01:00
Joao Gante
0fe0bae0a8
[Generation] remove leftover code from end-to-end compilation ( #36685 )
2025-03-19 11:28:33 +00:00
Mohamed Mekkouri
a861db01e5
Fix Device map for bitsandbytes tests ( #36800 )
...
fix
2025-03-19 11:57:13 +01:00
Yih-Dar
b9374a0763
Remove dist": "loadfile" for pytest in CircleCI jobs ( #36811 )
...
* fasterrrrr
* avoid crash in example jobs
* avoid crash in TF example jobs
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-03-19 11:15:09 +01:00
Yao Matrix
4fa91b1be5
fix "Cannot copy out of meta tensor; no data!" issue for BartForConditionalGeneration model ( #36572 )
...
* fix "Cannot copy out of meta tensor; no data!" issue for BartForConditionalGeneration model
* follow Marc's suggestion to use _tie_weights to fix
Signed-off-by: Yao, Matrix <matrix.yao@intel.com >
* fix review comments.
Signed-off-by: N <matrix.yao@intel.com >
* fix quality
Signed-off-by: N <matrix.yao@intel.com >
---------
Signed-off-by: Yao, Matrix <matrix.yao@intel.com >
Signed-off-by: N <matrix.yao@intel.com >
2025-03-19 10:48:47 +01:00
ivarflakstad
706703bba6
Expectations test utils ( #36569 )
...
* Add expectation classes + tests
* Use typing Union instead of |
* Use bits to track score in properties cmp method
* Add exceptions and tests + comments
* Remove compute cap minor as it is not needed currently
* Simplify. Remove Properties class
* Add example Exceptions usage
* Expectations as dict subclass
* Update example Exceptions usage
* Refactor. Improve type name. Document score fn.
* Rename to DeviceProperties.
2025-03-18 23:39:50 +01:00
Joao Gante
179d02ffb8
[generate] ✨ vectorized beam search ✨ ( #35802 )
2025-03-18 18:39:36 +00:00
Yoni Gozlan
12f2ebef63
Support custom docstrings in modular ( #36726 )
...
* Override docstrings in modular if not none
* Update doc
2025-03-18 14:00:54 -04:00
Gar
00915d3041
Fix chameleon's TypeError because inputs_embeds may be None ( #36673 )
...
* fix chameleon TypeError when inputs_embeds is None
* reformat
* hotfix
2025-03-18 18:59:30 +01:00
Marc Sun
14b597f518
Fix casting dtype for quantization ( #36799 )
...
* fix
* remove print
2025-03-18 18:46:03 +01:00
Yoni Gozlan
30580f035b
Fix Mistral3 tests ( #36797 )
...
* fix processor tests
* fix modeling tests
* fix test processor chat template
* revert modeling test changes
2025-03-18 13:08:12 -04:00
Cyril Vallez
db1d4c5a0b
Loading optimizations ( #36742 )
...
* improvements
* Update modeling_utils.py
* add some doc about loading
* Update modeling_utils.py
2025-03-18 16:38:44 +01:00
Yih-Dar
7baf00089a
Update SHA for tj-actions/changed-files ( #36795 )
...
* trigger
* trigger
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-03-18 16:19:39 +01:00
Marc Sun
3017536ebf
fix hqq due to recent modeling changes ( #36771 )
...
* fix-hqq
* style
* test
2025-03-18 12:20:27 +01:00
Cyril Vallez
e959530b8f
Add Mistral3 ( #36790 )
...
* initial start
* style and dummies
* Create convert_mistral3_weights_to_hf.py
* update
* typo
* typo
* Update convert_mistral3_weights_to_hf.py
* Update convert_mistral3_weights_to_hf.py
* Update convert_mistral3_weights_to_hf.py
* Update convert_mistral3_weights_to_hf.py
* up
* Update convert_mistral3_weights_to_hf.py
* Update convert_mistral3_weights_to_hf.py
* update
* update
* Update image_processing_mistral3.py
* Update convert_mistral3_weights_to_hf.py
* fix patch merger
* Update convert_mistral3_weights_to_hf.py
* Update convert_mistral3_weights_to_hf.py
* up
* update modular to fit
* style
* Update convert_mistral3_weights_to_hf.py
* typo
* Update modular_mistral3.py
* simplify a lot all shape shenanigans
* simplify
* add working test processor
* Add partially working common modeling tests
* All tests working and remove mistral3 image processors
* add docs and fixup
* fix inference with image size >1540
* 🚨 fix test image proc pixtral
* Remove vision_feature_select_strategy
* Update convert_mistral3_weights_to_hf.py
* Update convert_mistral3_weights_to_hf.py
* Update convert_mistral3_weights_to_hf.py
* Update convert_mistral3_weights_to_hf.py
* clean
* fix test checkpoints
* Update test_modeling_mistral3.py
* Update test_modeling_mistral3.py
* style
* Use Pixtral processor
* up
* finish cleaning processor to use pixtral directly
* Update __init__.py
* Update processing_pixtral.py
* doc
* Update __init__.py
* Update mistral3.md
* Update _toctree.yml
---------
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co >
Co-authored-by: yonigozlan <yoni.gozlan10@gmail.com >
v4.49.0-Mistral-3
2025-03-18 12:04:42 +01:00
Lysandre Debut
bd92073692
Fix gemma3_text tokenizer in mapping ( #36793 )
2025-03-18 11:50:22 +01:00
Zebin
7426d02ea8
Fixing typo in gemma3 image_processor_fast and adding a small test ( #36776 )
...
Co-authored-by: zebz13 <zeb@fedora>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com >
2025-03-18 11:35:06 +01:00
Afanti
19b9d8ae13
chore: fix typos in tests directory ( #36785 )
...
* chore: fix typos in tests directory
* chore: fix typos in tests directory
* chore: fix typos in tests directory
* chore: fix typos in tests directory
* chore: fix typos in tests directory
* chore: fix typos in tests directory
* chore: fix typos in tests directory
2025-03-18 10:31:13 +01:00
Afanti
7f5077e536
fix typos in the tests directory ( #36717 )
2025-03-17 17:45:57 +00:00
Daniel Kleine
cbfb8d7b27
doc: Clarify is_decoder usage in PretrainedConfig documentation ( #36724 )
...
* fix: clarify decoder usage in PretrainedConfig documentation
* Apply suggestions from code review
updated doc
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-03-17 09:40:25 -07:00
Steven Liu
ac1a1b66b9
[docs] Update README ( #36265 )
...
* update
* feedback
* feedback
* update versions
2025-03-17 09:37:19 -07:00
Joao Gante
cff4caa0c1
[CI] remove redundant checks in test_eager_matches_sdpa_inference ( #36740 )
2025-03-17 16:29:18 +00:00
Christopher Akiki
e3af4fec91
[MINOR:TYPO] Update hubert.md ( #36733 )
...
* [MINOR:TYPO] Update hubert.md
- typo fix (wave2vec instead of hubert)
- make code snippet copiable and runnable
* Run tests
2025-03-17 09:07:51 -07:00
Petr Kuderov
c8a2b25f91
Fix TrainingArguments.torch_empty_cache_steps post_init check ( #36734 )
...
Mistaken use of De Morgan's law. Fixed "not (X or Y)"
to correct "not (X and Y)" check to raise a ValueError.
Added corresponding test to check "positive int or None" condition.
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com >
2025-03-17 16:09:46 +01:00
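The De Morgan slip described in #36734 can be illustrated with a minimal sketch. The function name and error text below are hypothetical and this is not the actual `TrainingArguments` code; it only shows the "positive int or None" condition the commit message names, and why `not (X or Y)` lets bad values through.

```python
def check_empty_cache_steps(value: object) -> None:
    """Raise ValueError unless value is None or a positive int (hypothetical helper)."""
    if value is None:
        return
    # Correct form: invalid when NOT (is an int AND is positive).
    # The mistaken form, not (is an int OR is positive), accepts -1
    # (it is an int) and 1.5 (it is positive), so neither would raise.
    if not (isinstance(value, int) and value > 0):
        raise ValueError(f"expected a positive int or None, got {value!r}")
```

With this check, `None` and `3` pass silently, while `0`, `-1` and `1.5` each raise a `ValueError`.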
Sambhav Dixit
8e67230860
Fix test isolation for clear_import_cache utility ( #36345 )
...
* test fixup
* test fixup
* fixing tests for unused imports
* style fixes
* fix
* style fixes
* style fix
* remove isolated module cache
* rm custom subprocess definition
* run using existing fn
* style fixup
* make fixup
* remove redundant comments
* rm redundant skipif + style changes
2025-03-17 16:09:09 +01:00
jiqing-feng
27361bd218
fix xpu tests ( #36656 )
...
* fix awq xpu tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* update
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix llava next video bnb tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
2025-03-17 15:57:49 +01:00
Fredrik Norén
da7d64f4ff
Allow ray datasets to be used with trainer ( #36699 )
...
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
2025-03-17 15:44:47 +01:00
jiqing-feng
2256875a77
fix can_generate ( #36570 )
...
* fix can_generate
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix can generate for speecht5 and blip
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix speecht5 tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com >
2025-03-17 14:56:18 +01:00
Marc Sun
9e94801146
enable/disable compile for quants methods ( #36519 )
...
* disable compile for most quants methods
* fix
* Update src/transformers/generation/configuration_utils.py
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com >
* Update tests/quantization/bnb/test_mixed_int8.py
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com >
* Update src/transformers/generation/configuration_utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com >
* changes from joao suggestions
---------
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com >
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com >
2025-03-17 11:38:21 +01:00
Armaghan Shakir
c53d53da89
🚨 🚨 🚨 Fix sdpa in SAM and refactor relative position embeddings ( #36422 )
...
* fall back to eager if output_attentions
* improve relative position embeddings
* run modular on got_ocr2
* run-slow: sam
* fix run-length encoding
* fix tf processor errors
* update tf_sam
* fix compile error
* re-run tests
2025-03-17 09:39:52 +00:00
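The "fall back to eager if output_attentions" step in #36422 follows from PyTorch's `scaled_dot_product_attention` returning only the attention output, not the attention weights. A minimal sketch of that kind of fallback, using a hypothetical helper name and plain strings for the implementation names (the real SAM code may be structured differently):

```python
def resolve_attention_impl(requested: str, output_attentions: bool) -> str:
    """Pick the attention path to run (hypothetical helper, not the SAM code)."""
    # SDPA cannot hand back attention weights, so a caller that asks for
    # them has to be routed to the eager implementation instead.
    if requested == "sdpa" and output_attentions:
        return "eager"
    return requested
```

Any other combination keeps the requested implementation unchanged.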
Joao Gante
fc8764c9a6
[Generation, Gemma 3] When passing a custom generation_config, overwrite default values with the model's base generation_config ( #36684 )
2025-03-15 12:40:09 +00:00
Guillaume LEGENDRE
f263e88dcf
Update self-push-caller.yml
2025-03-15 11:32:04 +01:00
Ilyas Moutawwakil
6f3e0b68e0
Fix grad accum arbitrary value ( #36691 )
2025-03-14 22:03:01 +01:00
Cyril Vallez
2c2495cc7b
Fix post_init() code duplication ( #36727 )
...
* Update modeling_utils.py
* CIs
2025-03-14 17:36:02 +01:00
MaCAT
25992b493c
🌐 [i18n-KO] Translated codegen.md to Korean ( #36698 )
...
* Initial translation
* Add _toctree.yml
2025-03-14 09:31:18 -07:00
Joao Gante
42ebb6c23e
[tests] Parameterized test_eager_matches_sdpa_inference ( #36650 )
2025-03-14 14:41:27 +00:00
Matt
9215cc62d4
Try working around the processor registration bugs ( #36184 )
...
* Try working around the processor registration bugs
* oops
* Update error message
* Clarify error
* Docstring docstring docstring
* The extra content is indexed by config class, so let's grab some values out of there
* Commit my confusion as a TODO
* Resolve my confusion
* Cleanup and mostly revert to the original
* Better autoclass fallback
* Don't nest f-strings you lunatic
* Clearer error message
* Less getattr()
* Revert a lot of changes to try a different approach!
* Try the global registry
* Check the dynamic list as well as the transformers root
* Move the dynamic list somewhere safer
* Move the dynamic list somewhere even safer
* More import cleanup
* Simplify all the register_for_auto_class methods
* Set _auto_class in the register() methods
* Stop setting the cls attribute in register()
* Restore specifying the model class for Model derivatives only
* Fix accidentally taking the .__class__ of a class
* Revert register_for_auto_class changes
* Fix get_possibly_dynamic_module
* No more ALL_CUSTOM_CLASSES
* Fix up get_possibly_dynamic_module as well
* Revert unnecessary formatting changes
* Trigger tests
2025-03-14 13:56:21 +00:00
Sean (Seok-Won) Yi
691d1b52c3
Fix/best model checkpoint fix ( #35885 )
...
* Set best_model_checkpoint only when ckpt exists.
Previously it was set explicitly without checking whether the checkpoint directory even exists; the setting logic now lives inside _save_checkpoint and only sets it if the directory exists.
* Added best_global_step to TrainerState.
* Added tests for best_model_checkpoint.
* Fixed hard-coded values in test to prevent fail.
* Added helper func and removed hard-coded best_step.
* Added side effect patch generator for _eval.
* Added evaluate side effect func.
* Removed erroneous patching.
* Fixed minor bug.
* Applied Ruff.
* Fixed Ruff problem in make style.
* Used Trainer.set_initial_training_values.
2025-03-14 14:24:53 +01:00
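The first bullet of #35885 (set `best_model_checkpoint` only when the checkpoint exists) can be sketched as follows. `RunState` and `record_best_checkpoint` are hypothetical stand-ins, and the `checkpoint-<step>` folder name is an assumption about the layout; this is not the Trainer's actual `_save_checkpoint` logic.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunState:
    """Hypothetical stand-in for the trainer state."""
    best_model_checkpoint: Optional[str] = None


def record_best_checkpoint(state: RunState, output_dir: str, step: int) -> bool:
    """Point the state at checkpoint-<step> only if that directory exists."""
    candidate = os.path.join(output_dir, f"checkpoint-{step}")
    if not os.path.isdir(candidate):
        return False  # leave the previously recorded best untouched
    state.best_model_checkpoint = candidate
    return True
```

The boolean return value is only there so a caller can tell whether the state was updated.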
Joao Gante
3bd1a0ddf1
[model loading] don't gc.collect() if only 1 shard is used ( #36721 )
...
* don't gc collect if 1 shard is used
* delete state dict anyways
2025-03-14 12:56:56 +00:00
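The two bullets of #36721 (skip `gc.collect()` for a single shard, but delete the state dict regardless) suggest a loop shaped roughly like the sketch below. `load_shards`, `read_shard` and `apply_weights` are hypothetical names, not functions from `modeling_utils.py`; the returned counter exists only to make the behaviour observable.

```python
import gc
from typing import Callable, Dict, List


def load_shards(
    shard_files: List[str],
    read_shard: Callable[[str], Dict[str, object]],
    apply_weights: Callable[[Dict[str, object]], None],
) -> int:
    """Load shards one by one; return how many times gc.collect() ran."""
    collections = 0
    multiple_shards = len(shard_files) > 1
    for path in shard_files:
        state_dict = read_shard(path)
        apply_weights(state_dict)
        # Always drop the reference so the shard's memory can be reclaimed.
        del state_dict
        # Force a collection only when the checkpoint has several shards.
        if multiple_shards:
            gc.collect()
            collections += 1
    return collections
```

With a single shard file the function returns 0, since the forced collection is skipped; with three shard files it returns 3.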
Matt
8cb522b419
Cleanup the regex used for doc preprocessing ( #36648 )
...
* Cleanup the regex used for doc preprocessing
* Run tests
2025-03-14 12:18:49 +00:00
Matt
72861e11eb
Make the flaky list a little more general ( #36704 )
...
* Make the flaky list a little more general
* Trigger tests
* Make the flaky list a little more general
2025-03-14 12:15:32 +00:00
Kingsley
53742b11f5
Gemma3 processor typo ( #36710 )
...
* fix typo when is on
* tiny
* add test and remove 'text_crops'
* lint
2025-03-14 13:07:55 +01:00
Yoni Gozlan
69bc848480
Add support for fast image processors in add-new-model-like CLI ( #36313 )
...
* add support for fast image processors in add-new-model-like
* fix header not found add-fast-image-processor-cli
* Encourage adding fast image processor
* nit
* start improve doc
* update docs
* make requested modifs
2025-03-13 14:16:37 -04:00
Matt
48ef468c74
Final CI cleanup ( #36703 )
...
* make fixup
* make fixup
* Correct skip decorator
* Add TODOs
* add is_flaky() parentheses
2025-03-13 17:26:09 +00:00
Isotr0py
b070025aa6
Add GGUF support to T5-Encoder ( #36700 )
...
* add gguf support to t5encoder
Signed-off-by: Isotr0py <2037008807@qq.com >
* fix
Signed-off-by: Isotr0py <2037008807@qq.com >
* remove gguf from model_kwargs
Signed-off-by: Isotr0py <2037008807@qq.com >
---------
Signed-off-by: Isotr0py <2037008807@qq.com >
2025-03-13 17:57:33 +01:00
Mohamed Mekkouri
4a60bae8e2
Handling an exception related to HQQ quantization in modeling ( #36702 )
...
* adding exception
* style
* add types
2025-03-13 17:53:36 +01:00
Mehant Kammakomati
09a309d273
fix: fsdp sharded state dict won't work for save_only_model knob ( #36627 )
...
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com >
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
2025-03-13 17:17:35 +01:00
Cyril Vallez
2a004f9ff1
Add loading speed test ( #36671 )
...
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* trigger CIs
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* Update test_modeling_utils.py
* better error messages
* Update test_modeling_utils.py
* Update test_modeling_utils.py
2025-03-13 17:07:30 +01:00
Joao Gante
a3201cea14
[CI] Automatic rerun of certain test failures ( #36694 )
2025-03-13 15:40:23 +00:00