2120 Commits

Author SHA1 Message Date
Yoni Gozlan
7b4d9843ba Add fast image processor Janus, Deepseek VL, Deepseek VL hybrid (#39739)
* add fast image processor Janus, deepseek_vl, deepseek_vl_hybrid

* fix after review
2025-08-01 12:20:08 -04:00
rziga
3951d4ad5d Add MM Grounding DINO (#37925)
* first commit

Added modular implementation for MM Grounding DINO from starting point created by add-new-model-like. Added conversion script from mmdetection to huggingface.

TODO: Some tests are failing so that needs to be fixed.

* fixed a bug with modular definition of MMGroundingDinoForObjectDetection where box and class heads were not correctly assigned to inner model

* cleaned up a hack in the conversion script

* Fixed the expected values in integration tests

Cross att masking and cpu-gpu consistency tests are still failing however.

* changes for make style and quality

* add documentation

* clean up contrastive embedding

* add mm grounding dino to loss mapping

* add model link to config docstring

* hack fix for mm grounding dino consistency tests

* add special cases for unused config attr check

* add all models and update docs

* update model doc to the new style

* Use super_kwargs for modular config

* Move init to the _init_weights function

* Add copied from for tests

* fixup

* update typehints

* Fix-copies for tests

* fix-copies

* Fix init test

* fix snippets in docs

* fix consistency

* fix consistency

* update conversion script

* fix nits in readme and remove old comments from conversion script

* add license

* remove unused config args

* remove unnecessary if/else in model init

* fix quality

* Update references

* fix test

* fixup

---------

Co-authored-by: qubvel <qubvel@gmail.com>
2025-08-01 15:43:23 +01:00
Arthur
c962f1515e [attn_implementation] remove recursive, allows custom kernels with wrappers (#39823)
* fix?

* fixme and style

* Update src/transformers/modeling_utils.py

* update

* update

* fix

* small fixes

* nit

* nits

* fix init check?

* fix

* fix default

* or fucks me

* nits

* include a small nit

* does this make it happy?

* fixup

* fix the remaining ones
2025-08-01 12:18:28 +02:00
Raushan Turganbay
d3b8627b56 [VLMs] split out "get placeholder mask" to helper (#39777)
* batch update all models

* update

* forgot about llava onevision

* update

* fix tests

* delete file

* typo

* fix emu3 once and forever

* update cohere2 vision as well
2025-08-01 08:01:06 +00:00
Raushan Turganbay
e1688d28d3 [Model] Cohere2 Vision (#39810)
* Add cohere2_vision to support CohereLabs/command-a-vision-07-2025

* update and add modular file

* update processors and check with orig impl later

* delete unused files

* image processor reduce LOC and re-use GotOCR2

* update the config to use modular

* model tests pass

* processor fixes

* check model outputs decorator

* address one more comment

* Update tokens. Temp - need to read from tokenizer

* fix for multi-gpu

* Fix image token handling

* update image token expansion logic

* fix a few issues with remote code loading

* not related but modular forces us to change all files now

* Add overview and code sample to cohere vision docs

* add scripts. TMP.

* Update inference script

* Create script

* set dtype in export script

* TO revert: modular export fix

* Fix scripts

* Revert "TO revert: modular export fix"

This reverts commit bdb2f305b61027a05f0032ce70d6ca698879191c.

* Use modular weights

* Upload to hub

Removed OOD weights and script

* Updated docs

* fix import error

Update docs

Added pipeline test

* Updated docs

* Run modular script

remove modular for config

Added patch_size

Added docstrings in modular

Fix OOM

Add docs, fixup integration tests. 8-gpu passing

* tiny updates

* address comments + fixup

* add test for chat template

* check model outputs workaround

* aya vision fix check model inputs

* Revert "add test for chat template"

This reverts commit 42c756e397f588d76b449ff1f93292d8ee0202d8.

* revert more changes

* last revert

* skip and merge

* faulty copy from

---------

Co-authored-by: Julian Mack <julian.mack@cohere.com>
Co-authored-by: kyle-cohere <kyle@cohere.com>
2025-07-31 10:57:34 +00:00
Joao Gante
4f93cc9174 fix: providing a tensor to cache_position in model.generate kwargs always crashes because of boolean test (#39300)
* fix: cache_position: RuntimeError: Boolean value of Tensor with more than one value is ambiguous

* test cache_position

* move test

* propagate changes

---------

Co-authored-by: Masataro Asai <guicho2.71828@gmail.com>
2025-07-30 17:30:28 +00:00
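The failure mode named in commit 4f93cc9174 can be illustrated without PyTorch. This is a minimal sketch, assuming the crash came from a truthiness check on the tensor: `FakeTensor` is a hypothetical stand-in that refuses bool conversion the way a multi-element tensor does, and both helper functions are illustrative names, not code from the PR.

```python
class FakeTensor:
    """Hypothetical stand-in for a multi-element tensor (not a real torch class)."""

    def __init__(self, values):
        self.values = list(values)

    def __bool__(self):
        # Mimics the error quoted in the commit for tensors with more than one element.
        if len(self.values) > 1:
            raise RuntimeError(
                "Boolean value of Tensor with more than one value is ambiguous"
            )
        return bool(self.values and self.values[0])


def has_cache_position_buggy(cache_position):
    # Truthiness test: calls __bool__ and raises for multi-element tensors.
    return bool(cache_position)


def has_cache_position_fixed(cache_position):
    # Identity test: never evaluates the tensor's contents.
    return cache_position is not None
```

Checking `is not None` answers "was the argument provided?" without asking the tensor to collapse itself into a single boolean.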
Yuanyuan Chen
1e0665a191 Simplify conditional code (#39781)
* Use !=

Signed-off-by: cyy <cyyever@outlook.com>

* Use get

Signed-off-by: cyy <cyyever@outlook.com>

* Format

* Simplify bool operations

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-07-30 12:32:10 +00:00
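The three bullets in commit 1e0665a191 ("Use !=", "Use get", "Simplify bool operations") name common rewrites. The sketch below shows one plausible before/after pair for each; the function names and arguments are hypothetical and are not taken from the PR.

```python
def differs_before(a, b):
    return not a == b


def differs_after(a, b):
    # "Use !=": the same comparison, stated directly.
    return a != b


def lookup_before(config, key, default):
    return config[key] if key in config else default


def lookup_after(config, key, default):
    # "Use get": dict.get performs the membership check and the fallback in one call.
    return config.get(key, default)


def enabled_before(flag):
    return True if flag else False


def enabled_after(flag):
    # "Simplify bool operations": bool() already yields True or False.
    return bool(flag)
```

For built-in types each pair behaves identically; a class that defines `__eq__` and `__ne__` inconsistently could make the first pair differ, which is worth checking before applying such a rewrite mechanically.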
Cyril Vallez
67cfe11528 Fix Evolla and xLSTM tests (#39769)
* fix all evolla

* xlstm
2025-07-30 09:51:55 +02:00
Cyril Vallez
ddd2100767 Fix OmDet test after arg deprecation (#39766)
fix arg name
2025-07-29 22:10:36 +02:00
Manuel de Prada Corral
c4e2069898 Fix Cache.max_cache_len max value for Hybrid models (#39737)
* fix gemma

* fix min

* fix quant init issue

* fix gemma 3n

* skip quant cache test

* fix modular

* new test for Gemma

* include cyril change

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-07-29 17:12:50 +02:00
Raushan Turganbay
1ad216bd7d [modernbert] fix regression (#39750)
* fix regression

* add FA2 test
2025-07-29 16:58:59 +02:00
Yih-Dar
dfd616e658 Avoid OOM when other tests are failing (#39758)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-29 15:35:44 +02:00
Yuanyuan Chen
95faabf0a6 Apply several ruff SIM rules (#37283)
* Apply ruff SIM118 fix

Signed-off-by: cyy <cyyever@outlook.com>

* Apply ruff SIM910 fix

Signed-off-by: cyy <cyyever@outlook.com>

* Apply ruff SIM101 fix

Signed-off-by: cyy <cyyever@outlook.com>

* Format code

Signed-off-by: cyy <cyyever@outlook.com>

* More fixes

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-07-29 11:40:34 +00:00
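For readers unfamiliar with the rule codes in commit 95faabf0a6, the sketch below shows the pattern each one targets in Ruff's flake8-simplify set: SIM118 (membership test on `.keys()`), SIM910 (`.get` with an explicit `None` default), and SIM101 (repeated `isinstance` calls). The helper names are hypothetical and only illustrate the shape of the fix.

```python
def has_key_before(mapping, key):
    return key in mapping.keys()


def has_key_after(mapping, key):
    # SIM118: membership on the dict itself checks its keys.
    return key in mapping


def get_before(mapping, key):
    return mapping.get(key, None)


def get_after(mapping, key):
    # SIM910: None is already the default of dict.get.
    return mapping.get(key)


def is_number_before(value):
    return isinstance(value, int) or isinstance(value, float)


def is_number_after(value):
    # SIM101: isinstance accepts a tuple of types.
    return isinstance(value, (int, float))
```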
Yih-Dar
de8d0cec30 update GemmaIntegrationTest::test_model_2b_bf16_dola again (#39731)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-29 11:42:55 +02:00
Raushan Turganbay
75794792ad BLIPs clean-up (#35560)
* blips clean up

* update processor

* readability

* fix processor length

* fix copies

* tmp

* update and fix copies

* why keep these, delete?

* fix test fetcher

* irrelevant comment

* fix tests

* fix tests

* fix copies
2025-07-29 10:03:06 +02:00
Ramesh
4f8f51be4e Add Fast Segformer Processor (#37024)
* Add Fast Segformer Processor

* Modified the params according to segformer model

* modified test_image_processing_Segformer_fast args

- removed redundant params like do_center_crop,center_crop which aren't present in the original segformer class

* added segmentation_maps processing logic from the slow segformer processing module with references from beitimageprocessing fast

* fixed code_quality

* added recommended fixes and tests to make sure everything processes smoothly

* Fixed SegmentationMapsLogic

- modified the preprocessing of segmentation maps to use tensors
- added batch support

* fixed some mismatched files

* modified the tolerance for tests

* use modular

* fix ci

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2025-07-28 19:22:32 +00:00
Avigyan Sinha
c353f2bb5e Superpoint fast image processor (#37804)
* feat: superpoint fast image processor

* fix: reran fast cli command to generate fast config

* feat: updated test cases

* fix: removed old model add

* fix: format fix

* Update src/transformers/models/superpoint/image_processing_superpoint_fast.py

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>

* fix: ported to torch and made requested changes

* fix: removed changes to init

* fix: init fix

* fix: init format fix

* fixed testcases and ported to torch

* fix: format fixes

* failed test case fix

* fix superpoint fast

* fix docstring

---------

Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2025-07-28 18:15:06 +00:00
Raushan Turganbay
1c6b47451d Fix cache-related tests (#39676)
* fix

* fix kyutai at last

* fix unrelated tests and copies

* update musicgen as well

* revert tensor

* fix old test failures

* why it wasn't added?
2025-07-28 17:30:11 +02:00
Eric Bezzam
7623aa3e5f Fix Qwen2AudioForConditionalGeneration.forward() and test_flash_attn_kernels_inference_equivalence (#39503)
* Add missing cache_position argument.

* Pass cache_position to language model.

* Overwrite prepare_inputs_for_generation.

* Set model to half precision for Flash Attention test.

* Cast model to bfloat16.
2025-07-28 16:35:08 +02:00
Yih-Dar
28f2619868 skip Glm4MoeModelTest::test_torch_compile_for_training (#39670)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-28 16:30:40 +02:00
Cyril Vallez
686bb3b098 Remove all expired deprecation cycles (#39725)
* remove all deprecation cycles

* style

* fix

* remove

* remove

* fix

* Update modular_dpt.py

* back

* typo

* typo

* final fix

* remove all args
2025-07-28 15:43:41 +02:00
Raushan Turganbay
b56d721397 [configuration] remove redundant classmethod (#38812)
* remove redundant classmethod

* warning message, add space between words

* fix tests

* fix copies
2025-07-28 10:38:48 +00:00
Raushan Turganbay
8b237b8639 [processors] add tests for helper fn (#39629)
* add tests for helpers

* duplicate test for each model

* why llava next video has no helper

* oops must have been in the commit

* fix test after rebase

* add copy from
2025-07-28 09:41:58 +00:00
BUI Van Tuan
6a61e16626 Fix missing initialization of FastSpeech2Conformer (#39689)
* fix missing initialization of FastSpeech2Conformer

* switch order and reactivate tests

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-07-28 10:47:39 +02:00
Cyril Vallez
18a7c29ff8 More robust tied weight test (#39681)
* Update test_modeling_common.py

* remove old ones

* Update test_modeling_common.py

* Update test_modeling_common.py

* add

* Update test_modeling_musicgen_melody.py
2025-07-25 22:03:21 +02:00
Garrett Goon
97f8c71f52 Add padding-free to Granite hybrid moe models (#39677)
* start fixing kwarg handling

* fmt

* updates padding free tests

* docs

* add missing kwargs modeling_granitemoe.py

* run modular util

* rm unrelated changes from modular util
2025-07-25 20:10:50 +02:00
lgai-exaone
c06d4cd6ce Add EXAONE 4.0 model (#39129)
* Add EXAONE 4.0 model

* Refactor EXAONE 4.0 modeling code

* Fix cache slicing on SWA + FA2

* Fix cache slicing on FA2 + HybridCache

* Update EXAONE 4.0 modeling code for main branch

* Update o_proj for asymmetric projection

* Address PR feedback

* Add EXAONE 4.0 docs

* Update EXAONE 4.0 modeling code for main branch

* update

* fix updates

* updates

* fix

* fix

* fix

---------

Co-authored-by: Arthur <arthur.zucker@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-07-25 19:58:28 +02:00
Cyril Vallez
6630c5b714 Add xlstm model (#39665)
* Add xLSTM cleanly with optimizations.

* Fix style.

* Fix modeling test.

* Make xLSTM package optional.

* Fix: Update torch version check.

* Fix: Bad variable naming in test.

* Fix: Import structure cleaning with Ruff.

* Fix: Update docstrings.

* Fix: Mitigate unused config attr tests by explicit usage.

* Fix: Skip tests, if xlstm library is not installed.

* Feat: Enable longer context window for inference by chunking.

* Fix: Make training test pass by lowering target accuracy.

* Chore: Increase test verbosity for failing generation test.

* Update docs/source/en/model_doc/xlstm.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Fix: Make xlstm available even without CUDA.

* Chore: Remove unnecessary import.

* Fix: Remove BOS insertion.

* Chore: Improve xLSTMCache documentation.

* Integrate basic xLSTM fallback code.

* Chore: Remove unnecessary import.

* Chore: Remove duplicate LayerNorm.

* chore: update copyright, minor reformatting

* fix: refactor mLSTMStateType due to missing torch import

* fix: add missing import

* Chore: Replace einops.

* fix: apply ruff formatting

* fix: run `make fix-copies` to re-generate dummy_pt_objects.py

* fix: make type hints Python 3.9 compatible

* fix: remove obsolete import

* fix: remove obsolete method from docs

* chore: remove obsolete `force_bos_token_insert` from config

* Chore: Remove duplicated xLSTMCache class.

* Fix: Formatting of modeling_xlstm.py

* Chore: Remove xlstm package requirement from test. Re-add update_rnn_state.

* Fix: Update xLSTMCache docstring.

* Feat: Add proper initialization of xLSTM.

* Chore: Re-format files.

* Chore: Adapt format.

* Fix: xLSTMCache import restructuring.

* Fix: Add __all__ lists to modeling and configuration files.

* Chore: Reformat.

* Fix: Remove unnecessary update_rnn_state function.

* Fix: Undo test accuracy quickfix.

* Fix: Update copyright year, remove config copy.

* Chore: Flatten all internal configs to xLSTMConfig.

* Fix: Unused config variables check.

* Chore: Remove unnecessary imports.

* Fix: Unify xlstm cache argument from batch_size to max_batch_size.

* Chore: Remove bad default arg value for xLSTMCache.

* Chore: Rename core configuration arguments to HF default in xLSTM.

* Chore: Fix formatting.

* Fix: xLSTM Cache config access.

* Fix: Update xlstm tests for config update.

* Feat: Re-add embbeding_dim, num_blocks config options for compat with xLSTM-7B.

* Fix: Configuration xLSTM python3.9 syntax.

* Fix: Difference to main in test_utils.py assertion.

* Fix: Bad syntax in xlstm config for python3.9.

* Fix: xLSTMConfig docstring.

* Fix: xLSTMConfig docstring.

* Fix typing issues in xLSTM and BeiT, Paligemma.

* Fix: Exclude xLSTM from test cache utils.

* Chore: Fix style.

* Chore: Fix format.

* Chore: Remove unnecessary LayerNorm, NormLayer layer abstractions.

* Chore: Remove asserts and replace with ValueErrors.

* Chore: Update __init__.py structure of xLSTM.

* Chore: Clean xLSTM initialization of weights.

* Fix index names in modeling_xlstm.py

* Update xlstm model test typing annotations.

* Fix: Remove all asserts.

* Revert changes to the main __init__.py

* Fix: Move xLSTMCache to modeling_xlstm.py

* Fix: Remove xLSTMForCausalLM mapping from modeling_auto.py

* Remove xLSTMCache from dummy_pt_objects.py

* Fix: Remove extended torchdynamo compilation check integrating cuda graph captures.

* Revert test_cache_utils.py xLSTM change.

* Fix: Move xLSTM init functions before init call.

* Remove xLSTMCache from generation utils.

* Fix: Clean xLSTM init functionality for recursive calls.

* Fix: Move xLSTMCache before its first call.

* Fix formatting.

* Add partial docstring for xLSTMModel forward.

* Fix xLSTMCache docstring in xLSTMModel.

* Remove xLSTMCache from public documentation. Update auto_docstring.

* Remove all aggressive shape comments

* style

* Fix names

* simplify

* remove output_hidden_states

* Update modeling_xlstm.py

* Update modeling_xlstm.py

* Update test_modeling_xlstm.py

* Update modeling_xlstm.py

* Update modeling_xlstm.py

* fix

* fix

* style

* style

---------

Co-authored-by: Korbinian Poeppel <korbinian.poeppel@nx-ai.com>
Co-authored-by: Korbinian Pöppel <37810656+kpoeppel@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Sebastian Böck <sebastian.boeck@nx-ai.com>
Co-authored-by: Korbinian Poeppel <poeppel@ml.jku.at>
2025-07-25 19:39:17 +02:00
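Two bullets in commit 6630c5b714 ("Remove asserts and replace with ValueErrors", "Remove all asserts") describe a general pattern. The sketch below shows it on a hypothetical `chunk_size` check; the function names and the parameter are illustrative assumptions, not identifiers from the xLSTM code.

```python
def check_chunk_size_with_assert(chunk_size):
    # An assert is skipped entirely when Python runs with -O,
    # so invalid input can pass through unnoticed.
    assert chunk_size > 0, "chunk_size must be positive"
    return chunk_size


def check_chunk_size(chunk_size):
    # An explicit exception is always raised and gives callers a specific type to catch.
    if chunk_size <= 0:
        raise ValueError(f"chunk_size must be positive, got {chunk_size}")
    return chunk_size
```

The explicit check keeps input validation active in optimized runs and separates user errors (`ValueError`) from internal invariants.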
Armaghan Shakir
69cff312f5 Add support for DeepseekAI's DeepseekVL (#36248)
* upload initial code

* update deepseek-vl adaptor

* update hierarchy of vision model classes

* update aligner model

* add text model

* Added Image Processor

* Added Image Processor

* Added Image Processor

* apply masks

* remove projection; add aligner

* remove interpolate_pos_encoding

* remove unused params in config

* cleaning

* Add the __init__ file

* added processing deepseek_vl class

* modified the deepseek-vl processor

* modified the deepseek-vl processor

* update __init__

* Update the image processor class name

* Added Deepseek to src/transformers/__init__.py file

* Added Deepseek to image_processing_auto.py

* update the __init__ file

* update deepseek_vl image processor

* Update Deepseek Processor

* upload fast image processor

* Revert "upload fast image processor"

This reverts commit 68c8fd50bafbb9770ac70c9de02448e2519219b4.

* update image processor

* flatten hierarchy

* remove DeepseekVLModel

* major update (complete modeling)

* auto modeling and other files

* formatting

* fix quality

* replace torchvision in modeling

* set default do_normalize to False

* add fast image processor template using tool

* update image processors

* add fast image processor to other files

* update license

* Added deepseek image testcases

* update image test

* update processor

* write CHAT_TEMPLATE

* update model for processor

* fix processor

* minor fixes and formatting

* fix image processing and tests

* fix interpolation in sam

* fix output_attentions in DeepseekVLModel

* upload test_modeling

* fix tests because of vocab size

* set use_high_res_vision=False in tests

* fix all modeling tests

* fix styling

* remove explicit background_color from image processors

* added test_processor

* added test_processor

* fix processor tests

* update docs

* update docs

* update docs

* update conversion script

* Fixed typos

* minor fixes from review

- remove model_id comments in examples
- remove from pre-trained auto mapping
- move to image-text-to-text from vision-to-seq in auto mapping
- add image_token_index to __init__ for config
- remove outdated temporary config in conversion script
- update example to use chat_template in docstring example
- update license 2021->2025

* fix type in config docstring

Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>

* update get_image_features

* fix config

* improve DeepseekVLImageProcessor.preprocess

* return image_hidden_states

* use AutoTokenizer and AutoImageProcessor in Processor

* fix model outputs

* make num_image_tokens configurable

* fix docstring of processor

* move system prompt to chat template

* fix repo consistency

* fix return_dict

* replace SamVisionEncoder with SamVisionModel

* update to remove deepcopy

* 🛠️  Major Architectural Changes (Adds DeepseekVLHybrid)

* fix quality checks

* add missing hybrid in auto modeling

* run make style

* update sam_hq

* update high_res_size in test

* update docs following #36979

* update code with auto_docstring

* update conversion scripts

* fix style

* fix failing test because of tuple

* set weights_only=True in conversion script

* use safetensors.torch.load_file instead of torch.load in conversion script

* make output_dir optional in conversion script

* fix code snippets in docs (now the examples work fine)

* integration tests for DeepseekVL

* update expected texts

* make style

* integration tests for DeepseekVLHybrid

* fix class name

* update expected texts for hybrid

* run "make style"

* update since changes in main

* run make-style

* nits since changes in main

* undo changes in sam

* fix tests

* fix tests; update with main

* update with main: output_attention/output_hidden_states

* fix copied part in deepseek_vl

* run fix-copies

* fix output_hidden_states

* sam: fix _init_weigths

* use modular for DeepseekVL

* make image processor more modular

* modular: use JanusPreTrainedModel

* janus: provide kwargs in loss

* update processors in conversion script

* Revert "sam: fix _init_weigths"

This reverts commit db625d0c68956c0dad45edd7a469b6a074905c27.

* run fix-copies

---------

Co-authored-by: Shakib-IO <shakib.khan17@northsouth.edu>
Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>
2025-07-25 19:18:50 +02:00
Xibin Bayes Zhou
45c7bfb157 Add evolla rebase main (#36232)
* add evolla

* adding protein encoder part

* add initial processing test

* save processor

* add docstring

* add evolla processor

* add two test

* change vision to protein

* change resampler to sequence_compressor

* change vision to protein

* initial update for llama

* add initial update for llamaForCausalLM

* add `test_processor`, `test_saprot_output`, `test_protein_encoder_output`

* change evolla, but still working on it

* add test_single_forward

* pass test_attention_outputs

* pass test_hidden_states_output

* pass test_save_load and test_from_pretrained_no_checkpoint

* pass test_cpu_offload

* skip some tests

* update new progress

* skip test_model_is_small

* pass test_model_weights_reload_no_missing_tied_weights

* pass test_model_get_set_embeddings

* pass test_cpu_offload

* skip test_resize_embeddings

* add pipeline_model_mapping

* remove old setUp

* pass processor save_pretrained and load_pretrained

* remove pooling layer

* pass test_inputs_embeds_matches_input_ids

* pass test_model_is_small

* pass test_attention_outputs

* pass test_initialization

* pass test_model_get_set_embeddings

* pass test_single_forward

* skip test_disk_offload_bin and test_disk_offload_safetensors

* fix most tests

* pass test_protein_encoder_output

* remove useless code

* add EvollaForProteinText2Text

* pass test_saprot_output

* pass all EvollaModelTest test and remove processor test

* add processor test to its own file

* skip is_training since esm skipped it and the saprot code causes error when setting is_training True

* pass processor tests

* solve all except config

* pass most cases

* change init

* add doc to `configuration_evolla.py`

* remove image_processing test

* remove extra processor test

* remove extra modules

* remove extra modules

* change all configs into one config

* pass all evolla test

* pass `make fixup`

* update short summary

* update Evolla-10B-hf

* pass check_dummies.py and check_code_quality

* fix `tests/models/auto/test_tokenization_auto.py::AutoTokenizerTest::test_model_name_edge_cases_in_mappings`

* remove dummy codes

* change format

* fix llava issue

* update format

* update to solve llama3 access issue

* update to make forward right

* solve processor save load problem from instructblip solution

* remove unexpected file

* skip `test_generation_tester_mixin_inheritance`

* add `test_single_forward_correct` and `test_inference_natural_language_protein_reasoning`

* add `modular_evolla.py`

* solved issue #36362

* run `make fixup`

* update modular

* solve float32 training

* add fix

* solve `utils/check_docstrings.py`

* update

* update

* update

* remove other files and replace sequential and einsum

* add use case in document

* update the models

* update model

* change some wrong code

* Update src/transformers/models/evolla/modular_evolla.py

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* Update src/transformers/models/evolla/modular_evolla.py

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* Update src/transformers/models/evolla/modular_evolla.py

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* Update src/transformers/models/evolla/modular_evolla.py

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* fix issues mentioned in PR

* update style and rearrange the placement

* fix return_dict argument issue

* solve SaProtConfig issue

* Solve EvollaSaProtRotaryEmbedding issue

* solve attention_mask issue

* solve almost all issues

* make style

* update config

* remove unrelated pickle file

* delete pickle files

* fix config

* simplify a lot

* remove past k-v from encoder

* continue work

* style

* skip it from init

* fix init

* fix init

* simplify more

* fill in docstrings

* change test for generation

* skip test

* fix style

---------

Co-authored-by: Chenchen Han <13980209828@163.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-07-25 19:11:57 +02:00
Yih-Dar
2670da66ce update expected outputs for whisper after #38778 (#39304)
* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-25 16:48:10 +00:00
Yih-Dar
4b125e2993 fix kyutai tests (#39416)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
2025-07-25 18:42:04 +02:00
Anton Vlasjuk
a91653561e [Ernie 4.5] Post merge adaptations (#39664)
* ernie 4.5 fixes

* Apply style fixes

* fix

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-07-25 17:36:18 +02:00
Joao Gante
5d0ba3e479 [CI] revert device in test_export_static_cache (#39662)
* revert device

* add todo
2025-07-25 15:36:12 +00:00
Yoni Gozlan
17f02102c5 🚨[Fast Image Processor] Force Fast Image Processor for Qwen2_VL/2_5_VL + Refactor (#39591)
* init

* Force qwen2VL image proc to fast

* refactor qwen2 vl fast

* fix copies

* Update after PR review and update tests to use return_tensors="pt"

* fix processor tests

* add BC for min pixels/max pixels
2025-07-25 11:11:28 -04:00
revanth
3b3f9c0c46 fix(voxtral): correct typo in apply_transcription_request (#39572)
* fix(voxtral): correct typo in apply_transcription_request

* temporary wrapper: apply_transcrition_request

* Update processing_voxtral.py

* style: sort imports in processing_voxtral.py

* docs(voxtral): fix typo in voxtral.md

* make style

* doc update

---------

Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
Co-authored-by: Eustache Le Bihan <eulebihan@gmail.com>
2025-07-25 12:09:44 +00:00
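Commit 3b3f9c0c46 renames a misspelled method and keeps the old spelling as a "temporary wrapper". A minimal sketch of that pattern follows, assuming the wrapper simply forwards to the corrected name and emits a warning; `TranscriptionProcessor`, its arguments, and the choice of `FutureWarning` are hypothetical and do not reproduce the Voxtral processor.

```python
import warnings


class TranscriptionProcessor:
    """Hypothetical processor, used only to illustrate the aliasing pattern."""

    def apply_transcription_request(self, audio, language=None):
        # Correctly spelled method holding the real logic.
        return {"audio": audio, "language": language}

    def apply_transcrition_request(self, *args, **kwargs):
        # Misspelled name kept temporarily so existing callers keep working.
        warnings.warn(
            "`apply_transcrition_request` is a misspelled alias; use "
            "`apply_transcription_request` instead.",
            FutureWarning,
            stacklevel=2,
        )
        return self.apply_transcription_request(*args, **kwargs)
```

Forwarding `*args` and `**kwargs` keeps the alias in sync with the real method's signature, so only one implementation has to be maintained until the alias is removed.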
Raushan Turganbay
c392d47c9b [attention] fix test for packed padfree masking (#39582)
* fix most tests

* skip a few more tests

* address comments

* fix chameleon tests

* forgot to uncomment

* qwen has its own tests with images, rename it as well
2025-07-25 07:44:52 +00:00
lmarshall12
565c035a2e Add owlv2 fast processor (#39041)
* add owlv2 fast image processor

* add Owlv2ImageProcessorFast to Owlv2Processor image_processor_class

* add Owlv2ImageProcessorFast to Owlv2Processor image_processor_class

* change references to owlVit to owlv2 in docstrings for post process methods

* change type hints from List, Dict, Tuple to list, dict, tuple

* remove unused typing imports

* add disable grouping argument to group images by shape

* run make quality and repo-consistency

* use modular

* fix auto_docstring

---------

Co-authored-by: Lewis Marshall <lewism@elderda.co.uk>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2025-07-25 02:40:11 +00:00
eustlb
ad6fd2da0e [Voxtral] values for A10 runners (#39605)
* values for A10 runners

* make

* as for Llava

* does not apply to Voxtral
2025-07-24 18:52:35 +02:00
StevenBucaille
12b612830d [efficientloftr] fix model_id in tests (#39621)
fix: wrong EfficientLoFTR model id in tests
2025-07-24 10:41:06 +01:00
Eric Bezzam
c5a80dd6c4 🔴 Fix EnCodec internals and integration tests (#39431)
* EnCodec fixes and update integration tests.

* Apply padding mask when normalize is False.

* Update comment of copied function.

* Fix padding mask within modeling.

* Revert padding function.

* Simplify handling of padding_mask.

* Address variable codebook size.

* Add output for padding for consistency with original model, fix docstrings.

* last_frame_pad_length as int

* Update example code.

* Improve docstring/comments.

* Shorten expected output.

* Consistent docstring.

* Parameterize tests.

* Properties for derived variables.

* Update expected outputs from GitHub runner.

* Consistent outputs with runner GPUs.
2025-07-23 19:39:27 +02:00
Eric Bezzam
7a4e2e7868 Fix DAC integration tests and checkpoint conversion. (#39313)
* Fix DAC (slow) integration tests.

* Fix DAC conversion.

* Address comments

* Sync with main, uncomment nn.utils.parametrizations.weight_norm.

* Update DAC integration tests with expected outputs.

* Added info about encoder/decoder error and longer decoder outputs.

* Parameterize tests.

* Set expected values to GitHub runners.
2025-07-23 19:21:26 +02:00
Pablo Montalvo
ea56eb6bed Fix important models CI (#39576)
* relax test boundaries and fix from config

* eager is always supported.
2025-07-23 16:24:29 +02:00
llbdyiu66
a62f65a989 fix moe routing_weights (#39581)
* fix moe routing_weights

* fix ernie4_5_moe routing_weights

* fix integration test

---------

Co-authored-by: llbdyiu66 <llbdyiu66@users.noreply.github.com>
Co-authored-by: Vasqu <antonprogamer@gmail.com>
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
2025-07-23 11:20:23 +00:00
Sangbum Daniel Choi
d9b35c635e Mask2former & Maskformer Fast Image Processor (#35685)
* add maskformerfast

* test

* revert do_reduce_labels and add testing

* make style & fix-copies

* add mask2former and make fix-copies
TO DO:
	add test for mask2former

* make fix-copies

* fill docstring

* enable mask2former fast processor

* python utils/custom_init_isort.py

* make fix-copies

* fix PR's comments

* modular file update

* add license

* make style

* modular file

* make fix-copies

* merge

* temp commit

* finish up maskformer mask2former

* remove zero shot examples

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-07-23 02:47:47 +00:00
space_samurai
c6d0500d15 [WIP] Add OneformerFastImageProcessor (#38343)
* [WIP] OneformerFastImageProcessor

* update init

* Fully working oneformer image processor fast

* change Nearest to Nearest exact interpolation where needed

* fix doc

---------

Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-07-22 20:41:39 +00:00
Manuel de Prada Corral
c338fd43b0 [cache refactor] Move all the caching logic to a per-layer approach (#39106)
* Squash for refactor: Replace monolithic cache classes with modular LayeredCache (#38077)

- Introduces CacheLayer and Cache base classes
- Ports Static, Dynamic, Offloaded, Quantized, Hybrid, etc. to use layers
- Implements method/attr dispatch across layers to reduce boilerplate
- Adds CacheProcessor hooks for offloading, quantization, etc.
- Updates and passes tests

* fix quantized, add tests

* remove CacheProcessorList

* raushan review, arthur review

* joao review: minor things

* remove cache configs, make CacheLayer a mixin (joaos review)

* back to storage inside Cache()

* remove cachebase for decorator

* no more __getattr__

* fix tests

* joaos review except docs

* fix ast deprecations for python 3.14: replace node.n by node.value and use `ast.Constant`

More verbose exceptions in `fix_docstring` on docstring formatting issues.

* Revert "back to storage inside Cache()"

This reverts commit 27916bc2737806bf849ce2148cb1e66d59573913.

* cyril review

* simplify cache export

* fix lfm2 cache

* HybridChunked to layer

* BC proxy object for cache.key_cache[i]=...

* reorder classes

* bfff come on LFM2

* better tests for hybrid and hybridChunked

* complete coverage for hybrid chunked caches (prefill chunking)

* reimplementing HybridChunked

* cyril review

* fix ci

* docs for cache refactor

* docs

* oopsie

* oopsie

* fix after merge

* cyril review

* arthur review

* opsie

* fix lfm2

* opsie2
2025-07-22 16:10:25 +02:00
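One bullet in commit c338fd43b0 replaces `node.n` with `node.value` and uses `ast.Constant`. The sketch below shows that access pattern with the standard `ast` module; `collect_int_constants` is a hypothetical helper written for illustration, not the repository's docstring utility.

```python
import ast


def collect_int_constants(source):
    """Return every integer literal in `source`, in tree-walk order."""
    found = []
    for node in ast.walk(ast.parse(source)):
        # ast.Constant covers numbers, strings, None and booleans;
        # the payload is read through .value rather than the older .n / .s.
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, int)
            and not isinstance(node.value, bool)  # bool is a subclass of int
        ):
            found.append(node.value)
    return found
```

Because a single node class now represents all literals, the type of `node.value` has to be checked explicitly where the older per-type node classes used to make that distinction.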
Ákos Hadnagy
015b62bf3e Add AMD GPU expectations for LLaVA tests (#39486)
* Add AMD GPU expectation to llava tests

* FMT

* Remove debug print

* Address review comments
2025-07-22 14:01:54 +00:00
Ákos Hadnagy
b62557e712 Add AMD expectations to Mistral3 tests (#39481)
Add AMD expectations to mistral3 tests
2025-07-22 15:40:16 +02:00
Ákos Hadnagy
ef99537f37 Add AMD test expectations to DETR model (#39539)
* Add AMD test expectations to DETR model

* Fix baseline expectation

* Address review comments

* Make formatting a bit more consistent
2025-07-22 12:07:10 +00:00