Cyril Vallez
4602059aae
[modular] Fix the prefix-based renaming if the old and new model share a common name suffix (#37829)
* first try
* Fix and set examples
* style
* fix
* Update modular_test_detr.py
* Update image_processing_new_imgproc_model.py
* Update modular_model_converter.py
2025-04-29 10:43:23 +02:00
Yuanyuan Chen
da4ff2a5f5
Add Optional to remaining types (#37808)
More Optional typing
Signed-off-by: cyy <cyyever@outlook.com>
2025-04-28 14:20:45 +01:00
Matt
9ec8be56dd
TransfoXL is deprecated, don't keep it in tested examples! (#37707)
* TransfoXL is deprecated, so we should remove it from examples that get tested
* Remove the tokenizer too
* Trigger tests
2025-04-23 14:59:38 +01:00
Ken J
ca4c114dc4
Add counters for dataset classes (#37636)
* add counters for dataset classes
* fix failed code style
2025-04-22 17:30:43 +01:00
jeffhataws
964a1b6b7d
Fix ValueError when eval_do_concat_batches=False with examples (#37621)
https://github.com/huggingface/transformers/issues/37593
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-22 12:13:25 +02:00
we1559
b0c6ff5e13
fix issue that some example with no trainer use accelerator.end_train… (#37435)
* fix issue that some example with no trainer use accelerator.end_training in a wrong way
* reformat code
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-18 17:59:42 +02:00
Cyril Vallez
8ab296501a
Remove deprecation warning for num_logits_to_keep (#37149)
* remove everything
* style
2025-04-14 19:08:45 +02:00
cyyever
0fb8d49e88
Use Python 3.9 syntax in examples (#37279)
Signed-off-by: cyy <cyyever@outlook.com>
2025-04-07 12:52:21 +01:00
Lysandre
d1b92369ca
v4.52.0.dev0
2025-04-05 22:04:21 +02:00
Jaime Fraustro
afafb84b59
Add support for fast image processing in image-pretraining example (#37021)
* Add support for fast image processing in image-pretraining example
Fix typo: correct tuple formatting in IMAGE_PROCESSOR_MAPPING_NAMES
Signed-off-by: jafraustro <jaime.fraustro.valdez@intel.com>
* Use fast image processor by default
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Signed-off-by: jafraustro <jaime.fraustro.valdez@intel.com>
---------
Signed-off-by: jafraustro <jaime.fraustro.valdez@intel.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-04-03 13:26:46 +01:00
cyyever
786d9c5ed9
Fix more inefficient PT operations (#37060)
* Fix inefficient operations
* Remove cpu() call
* Reorder detach()
* Reorder detach()
* tolist without detach
* item without detach
* Update src/transformers/models/rag/modeling_rag.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update tests/models/encodec/test_modeling_encodec.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Use detach().cpu().numpy
* Revert some numpy operations
* More fixes
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-03-31 16:31:24 +01:00
cyyever
f99c279d20
Remove deprecated code (#37059)
* Remove deprecated code
* fix get_loading_attributes
* fix error
* skip test
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-03-31 11:15:35 +02:00
efsotr
2b4734bd49
Support passing flash_attn_kwargs when gradient_checkpointing is enabled (#37037)
* support passing flash_attn_kwargs when gradient_checkpointing is enabled
* make modeling_deepseek_v3.py consistent with modular_deepseek_v3.py
2025-03-31 10:53:02 +02:00
cyyever
41a0e58e5b
Set weights_only in torch.load (#36991)
2025-03-27 14:55:50 +00:00
cyyever
2b550c47b2
Remove deprecated training arguments (#36946)
* Remove deprecated training arguments
* More fixes
* More fixes
* More fixes
2025-03-26 16:44:48 +00:00
Yih-Dar
121830ab47
update examples after ruff being updated (#36972)
* update
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-03-25 18:15:47 +01:00
湛露先生
ebd2029483
Change GPUS to GPUs (#36945)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-03-25 17:25:39 +01:00
Arthur Zucker
4542b8fb27
push v4.51.0.dev0
2025-03-21 13:45:25 +01:00
Matt
9be4728af8
Just import torch AdamW instead (#36177)
* Just import torch AdamW instead
* Update docs too
* Make AdamW undocumented
* make fixup
* Add a basic wrapper class
* Add it back to the docs
* Just remove AdamW entirely
* Remove some AdamW references
* Drop AdamW from the public init
* make fix-copies
* Cleanup some references
* make fixup
* Delete lots of transformers.AdamW references
* Remove extra references to adamw_hf
2025-03-19 18:29:40 +00:00
Matt
1e4286fd59
Remove research projects (#36645)
* Remove research projects
* Add new README to explain where the projects went
* Trigger tests
* Cleanup all references to research_projects
2025-03-11 13:47:38 +00:00
dependabot[bot]
4fce7a0f0f
Bump jinja2 from 3.1.5 to 3.1.6 in /examples/research_projects/decision_transformer (#36582)
Bump jinja2 in /examples/research_projects/decision_transformer
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.5 to 3.1.6.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/3.1.5...3.1.6)
---
updated-dependencies:
- dependency-name: jinja2
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-07 13:35:59 +00:00
dependabot[bot]
acc49e390d
Bump transformers from 4.38.0 to 4.48.0 in /examples/research_projects/pplm (#36540)
Bump transformers in /examples/research_projects/pplm
Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0)
---
updated-dependencies:
- dependency-name: transformers
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-06 11:35:47 +00:00
co63oc
37508816d6
chore: Fix typos in docs and examples (#36524)
Fix typos in docs and examples
Signed-off-by: co63oc <co63oc@users.noreply.github.com>
2025-03-04 13:47:41 +00:00
Kashif Rasul
9fe82793ee
[Style] fix E721 warnings (#36474)
* fix E721 warnings
* config.hidden_size is not a tuple
* fix copies
* fix-copies
* not a tuple
* undo
* undo
2025-03-03 18:03:42 +00:00
Cyril Vallez
da4ab2a1b6
Fix doc formatting in forward passes & modular (#36243)
* fix indentation issues + modular without magic keyword
* style
* Update doc.py
* style
* Fix all decorators indentation
* all models
* style
* style
* Update doc.py
* fix
* general fix
* style
2025-02-25 11:09:01 +01:00
Cyril Vallez
bc65f3fc1c
[modular] Do not track imports in functions (#36279)
* Add check
* just check for function
* Update examples
2025-02-25 10:29:47 +01:00
Mohamed Mekkouri
e5cea20743
Add Example for Custom quantization (#36286)
* add example
* rename
2025-02-19 17:09:23 +01:00
Parteek
8eaae6bee9
Added Support for Custom Quantization (#35915)
* Added Support for Custom Quantization
* Update code
* code reformatted
* Updated Changes
* Updated Changes
---------
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-02-18 16:14:19 +01:00
dependabot[bot]
3e970dbbf1
Bump transformers from 4.38.0 to 4.48.0 in /examples/research_projects/codeparrot/examples (#36237)
Bump transformers in /examples/research_projects/codeparrot/examples
Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0)
---
updated-dependencies:
- dependency-name: transformers
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-17 16:28:43 +00:00
Arthur Zucker
c877c9fa5b
v4.45.0-dev0
2025-02-17 15:21:20 +01:00
dependabot[bot]
c5506f4f00
Bump transformers from 4.38.0 to 4.48.0 in /examples/research_projects/adversarial (#36168)
Bump transformers in /examples/research_projects/adversarial
Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0)
---
updated-dependencies:
- dependency-name: transformers
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-13 15:06:16 +00:00
dependabot[bot]
d7c5d1b539
Bump transformers from 4.38.0 to 4.48.0 in /examples/tensorflow/language-modeling-tpu (#36167)
Bump transformers in /examples/tensorflow/language-modeling-tpu
Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0)
---
updated-dependencies:
- dependency-name: transformers
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-13 14:46:38 +00:00
Thomas Bauwens
8f137b2427
Move DataCollatorForMultipleChoice from the docs to the package (#34763)
* Add implementation for DataCollatorForMultipleChoice based on docs.
* Add DataCollatorForMultipleChoice to import structure.
* Remove custom DataCollatorForMultipleChoice implementations from example scripts.
* Remove custom implementations of DataCollatorForMultipleChoice from docs in English, Spanish, Japanese and Korean.
* Refactor torch version of DataCollatorForMultipleChoice to be more easily understandable.
* Apply suggested changes and run make fixup.
* fix copies, style and fixup
* add missing documentation
* nits
* fix docstring
* style
* nits
* isort
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2025-02-13 12:01:28 +01:00
dependabot[bot]
d52a9d08ce
Bump cryptography from 43.0.1 to 44.0.1 in /examples/research_projects/decision_transformer (#36142)
Bump cryptography in /examples/research_projects/decision_transformer
Bumps [cryptography](https://github.com/pyca/cryptography) from 43.0.1 to 44.0.1.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/43.0.1...44.0.1)
---
updated-dependencies:
- dependency-name: cryptography
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-12 13:34:52 +00:00
dependabot[bot]
31e4831b98
Bump transformers from 4.38.0 to 4.48.0 in /examples/research_projects/vqgan-clip (#36136)
Bump transformers in /examples/research_projects/vqgan-clip
Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0)
---
updated-dependencies:
- dependency-name: transformers
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-12 13:21:09 +00:00
Fanli Lin
9b69986e8a
[docs] minor doc fix (#36127)
fix
2025-02-11 10:31:12 -08:00
Liangliang Ma
315a9f494e
Add XPU type for work-around -inf mask causing sdpa NaN issue in modeling files (#35647)
* add xpu for unmask
* change modular for generated matching
* add latest modeling for helium
2025-02-05 13:28:31 +01:00
Yoni Gozlan
fa56dcc2ab
Refactoring of ImageProcessorFast (#35069)
* add init and base image processing functions
* add add_fast_image_processor to transformers-cli
* add working fast image processor clip
* add fast image processor to doc, working tests
* remove "to be implemented" SigLip
* fix unprotected import
* fix unprotected vision import
* update ViTImageProcessorFast
* increase threshold slow fast equivalence
* add fast img blip
* add fast class in tests with cli
* improve cli
* add fast image processor convnext
* add LlavaPatchingMixin and fast image processor for llava_next and llava_onevision
* add device kwarg to ImagesKwargs for fast processing on cuda
* cleanup
* fix unprotected import
* group images by sizes and add batch processing
* Add batch equivalence tests, skip when center_crop is used
* cleanup
* update init and cli
* fix-copies
* refactor convnext, cleanup base
* fix
* remove patching mixins, add piped torchvision transforms for ViT
* fix unbatched processing
* fix f strings
* protect imports
* change llava onevision to class transforms (test)
* fix convnext
* improve formatting (following Pavel review)
* fix handling device arg
* improve cli
* fix
* fix inits
* Add distinction between preprocess and _preprocess, and support for arbitrary kwargs through valid_extra_kwargs
* uniformize qwen2_vl fast
* fix docstrings
* add add fast image processor llava
* remove min_pixels max_pixels from accepted size
* nit
* nit
* refactor fast image processors docstrings
* cleanup and remove fast class transforms
* update add fast image processor transformers cli
* cleanup docstring
* uniformize pixtral fast and make _process_image explicit
* fix prepare image structure llava next/onevision
* Use typed kwargs instead of explicit args
* nit fix import Unpack
* clearly separate pops and gets in base preprocess. Use explicit typed kwargs
* make qwen2_vl preprocess arguments hashable
2025-02-04 17:52:31 -05:00
Ryoo Kwangrok
b1954fd64a
layernorm_decay_fix (#35927)
* layernorm_decay_fix
* W293 fix
* ruff format fix
* black format
* ruff format
* erase last layer
* add test_get_parameter_names_rmsnorm
* rmsnorm fix
2025-02-04 11:01:49 +01:00
Gar
9d2056f12b
Add mean_resizing for every VLMs' resizing_token_embeddings() (#35717)
* refine all resize_token_embedding()
* ruff format
* hotfix
2025-02-03 15:03:49 +01:00
Sugendran Ganess
14a9bb520e
Fix fast image processor warnings in object detection examples (#35892)
Have the DETR examples default to using the fast image processor
2025-01-27 08:32:44 +00:00
Joao Gante
90b46e983f
Remove old benchmark code (#35730)
* remove traces of the old deprecated benchmarks
* also remove old tf benchmark example, which uses deleted code
* run doc builder
2025-01-21 17:56:43 +00:00
Louie Tsai
f82b19cb6f
add a new flax example for Bert model inference (#34794)
* add a new example for flax inference cases
* Update examples/flax/language-modeling/README.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update examples/flax/language-modeling/README.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update examples/flax/language-modeling/README.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update examples/flax/language-modeling/README.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update examples/flax/language-modeling/README.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update examples/flax/language-modeling/README.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fix for "make fixup"
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-21 14:09:29 +01:00
Cyril Vallez
91be6a5eb2
Modular: support for importing functions from any file (#35692)
* fix function imports
* improve comment
* Update modeling_switch_function.py
* make checks more robust
* improvement
* rename
* final test update
2025-01-16 16:37:53 +00:00
Raushan Turganbay
09d5f76274
Clean-up composite configs (#34603)
* remove manual assignment tie-word-embeddings
* remove another unused attribute
* fix tests
* fix tests
* remove unnecessary overwrites
* fix
* decoder=True
* clean pix2struct
* run-all
* forgot `_tied_weights_keys` when adding Emu3
* also Aria + fix-copies
* and clean aria
2025-01-15 10:04:07 +01:00
Arthur Zucker
f63829c87b
v4.49.0-dev
2025-01-10 12:31:11 +01:00
Cyril Vallez
46276f9a7f
Fix modular edge case + modular sorting order (#35562)
* look-ahead negation
* re add examples by default
* Fix the bug in topological sort
* Update create_dependency_mapping.py
* start adding test
* finalize test
* more tests
* style
* style
2025-01-09 17:17:52 +01:00
Kevin R
665a4942e4
Check whether rescale is requested before checking is_scaled_image (#35439)
2025-01-07 11:39:45 +00:00
dependabot[bot]
86fa3cedad
Bump jinja2 from 3.1.4 to 3.1.5 in /examples/research_projects/decision_transformer (#35408)
Bump jinja2 in /examples/research_projects/decision_transformer
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.4 to 3.1.5.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/3.1.4...3.1.5)
---
updated-dependencies:
- dependency-name: jinja2
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-06 16:58:29 +00:00
Arthur
2c47618c1a
🚨 All attention refactor 🚨 (#35235)
* refactor LlamaAttention
* minimal changes
* fix llama
* update
* modular gemmas
* modular nits
* modular updates
* nits
* simplify
* gpt2
* more modular and fixes
* granite
* modular modular modular
* nits
* update
* qwen2 + starcoder2
* mostly gemma2
* Update image_processing_auto.py
* fix
* Update modular_starcoder2.py
* fix
* remove all copied from attentions
* remove gcv
* make fix-copies
* oups
* oups2.0
* fix some modulars + all copied from
* should be good now
* revert unwanted changes
* Update modeling_decision_transformer.py
* finish cleanup
* Update modeling_olmo.py
* consistency
* re-add gradient checkpointing attribute
* fix
* style
* make config necessary
* bis
* bis
* Update modeling_my_new_model2.py
* is_causal attr
* fix
* remove past kv return from decoder layer
* fix
* default rope config
* correctly fix rope config
* fix bias
* fix gpt2 attention output
* fix test
* fix inits
* fix default sdpa
* fix default sdpa implementation
* harmonize classes
* fix mistral
* fix sliding window models
* mixtral
* be more explicit
* style
* fix
* several fixes
* Update modeling_dbrx.py
* fix test
* olmo + phi
* rotary
* style
* phi
* phi again
* again
* kwargs
* Update test_modeling_common.py
* skip fx tracing tests
* Update modeling_utils.py
* gemma 2
* again
* Update modeling_recurrent_gemma.py
* gemma2
* granite
* style
* starcoder
* Update sdpa_attention.py
* switch args
* Update modeling_mllama.py
* fix
* cache type tests
* gpt2
* Update test_modeling_common.py
* fix
* consistency
* fix shape with encoder
* should be the last one
* tests non model
* most comments
* small oupsi
* be more explicit in modulars
* more explicit modulars
* CIs! it works locally
* add kwargs to _flash_attention_forward
---------
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2024-12-18 16:53:39 +01:00