HuggingFace_transformer

Author	SHA1	Message	Date
Guang Yang	d2ae766836	Export SmolvLM (#39614) Export SmolVLM for ExecuTorch	2025-08-05 16:20:23 +02:00
Arthur	20ce210ab7	Revert "remove dtensors, not explicit (#39840 )" (#39912 ) * Revert "remove dtensors, not explicit (#39840)" This did not work with generation (lm_head needs extra care!) This reverts commit `6dfd561d9c`. * update * style?	2025-08-05 15:12:14 +02:00
Raushan Turganbay	2589a52c5c	Fix aria tests (#39879 ) * fix aria tests * awful bug * fix copies * fix tests * fix style * revert this	2025-08-05 13:48:47 +02:00
Yuanyuan Chen	98a3c49135	Replace video_fps with fps in tests (#39898) Signed-off-by: cyy <cyyever@outlook.com>	2025-08-05 10:39:55 +00:00
Jan Netík	0bd91cc822	Add support for `ModernBertForMultipleChoice` (#39232 ) * implement ModernBertForMultipleChoice * fixup, style, repo consistency * generate modeling_modernbert * add tests + docs * fix test	2025-08-04 20:45:43 +02:00
Yih-Dar	ee7eb2d0b1	Update cohere2 vision test (#39888 ) * fix * fix * fix * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-08-04 20:08:18 +02:00
Cyril Vallez	380b2a0317	Rework add-new-model-like with modular and make test filenames coherent (#39612 ) * remove tf/flax * fix * style * Update add_new_model_like.py * work in progress * continue * more cleanup * simplify and first final version * fixes -> it works * add linter checks * Update add_new_model_like.py * fix * add modular conversion at the end * Update add_new_model_like.py * add video processor * Update add_new_model_like.py * Update add_new_model_like.py * Update add_new_model_like.py * fix * Update image_processing_auto.py * Update image_processing_auto.py * fix post rebase * start test filenames replacement * rename all test_processor -> test_processing * fix copied from * add docstrings * Update add_new_model_like.py * fix regex * improve wording * Update add_new_model_like.py * Update add_new_model_like.py * Update add_new_model_like.py * start adding test * fix * fix * proper first test * tests * fix * fix * fix * fix * modular can be used from anywhere * protect import * fix * Update add_new_model_like.py * fix	2025-08-04 14:41:09 +02:00
Pavel Iakubovskii	16d6faef9a	[core] Fix attn_implementation setter with missing `sub_configs` (#39855 ) * fix * add sub_configs * remove case for attention setter * fix None * Add test * Fix sub-configs * fix tests_config * fix consistency * fix fsmt * fix	2025-08-04 11:35:09 +01:00
Akib Jawad	2a9febd632	Add support for including in-memory videos (not just files/urls) in apply_chat_template (#39494 ) * added code for handling video object ,as dictionary of frames and metadata, in chat template * added new test where videos are passed as objects (dict of frames, metadata) in the chat template * modified hardcoded video_len check that does not match with increased number of tests cases. * Modify hardcoded video_len check that fails with increased number of tests * update documentation of multi-modal chat templating with extra information about including video object in chat template. * add array handling in load_video() * temporary test video inlcuded * skip testing smolvlm with videos that are list of frames * update documentation & make fixup * Address review comments	2025-08-04 11:49:42 +02:00
Arthur	6dfd561d9c	remove dtensors, not explicit (#39840 ) * remove dtensors, not explicit Co-authored-by: 3outeille <3outeille@users.noreply.github.com> * style * fix test * update * as we broke saving try to fix * output layouts should exit * nit * devicemesh exists if it was distributed * use _device_mesh of self * update * lol * fix * nit * update * fix! * this??? * grumble grumble * ? * fuck me --------- Co-authored-by: 3outeille <3outeille@users.noreply.github.com>	2025-08-01 22:02:47 +02:00
Yoni Gozlan	7b4d9843ba	Add fast image processor Janus, Deepseek VL, Deepseek VL hybrid (#39739 ) * add fast image processor Janus, deepseek_vl, deepseek_vl_hybrid * fix after review	2025-08-01 12:20:08 -04:00
Lysandre Debut	88ead3f518	Fix responses add tests (#39848 ) * Quick responses fix * [serve] Fix responses API and add tests * Remove typo * Remove typo * Tests	2025-08-01 18:06:08 +02:00
rziga	3951d4ad5d	Add MM Grounding DINO (#37925 ) * first commit Added modular implementation for MM Grounding DINO from starting point created by add-new-model-like. Added conversion script from mmdetection to huggingface. TODO: Some tests are failing so that needs to be fixed. * fixed a bug with modular definition of MMGroundingDinoForObjectDetection where box and class heads were not correctly assigned to inner model * cleaned up a hack in the conversion script * Fixed the expected values in integration tests Cross att masking and cpu-gpu consistency tests are still failing however. * changes for make style and quality * add documentation * clean up contrastive embedding * add mm grounding dino to loss mapping * add model link to config docstring * hack fix for mm grounding dino consistency tests * add special cases for unused config attr check * add all models and update docs * update model doc to the new style * Use super_kwargs for modular config * Move init to the _init_weights function * Add copied from for tests * fixup * update typehints * Fix-copies for tests * fix-copies * Fix init test * fix snippets in docs * fix consistency * fix consistency * update conversion script * fix nits in readme and remove old comments from conversion script * add license * remove unused config args * remove unnecessary if/else in model init * fix quality * Update references * fix test * fixup --------- Co-authored-by: qubvel <qubvel@gmail.com>	2025-08-01 15:43:23 +01:00
Arthur	c962f1515e	[`attn_implementation`] remove recursive, allows custom kernels with wrappers (#39823 ) * fix? * fixme and style * Update src/transformers/modeling_utils.py * update * update * fix * small fixees * nit * nits * fix init check? * fix * fix default * or fucks me * nits * include a small nit * does this make it hapy? * fixup * fix the remaining ones	2025-08-01 12:18:28 +02:00
Raushan Turganbay	d3b8627b56	[VLMs] split out "get placeholder mask" to helper (#39777 ) * batch upidate all models * update * forgot about llava onevision * update * fix tests * delete file * typo * fix emu3 once and forever * update cohere2 vision as well	2025-08-01 08:01:06 +00:00
Raushan Turganbay	e1688d28d3	[Model] Cohere2 Vision (#39810 ) * Add cohere2_vision to support CohereLabs/command-a-vision-07-2025 * update and add modualr file * update processors and check with orig impl later * delete unused files * image processor reduce LOC and re-use GotOCR2 * update the config to use modular * model tests pass * processor fixes * check model outputs decorator * address one more comment * Update tokens. Temp - need to read from tokenizer' * fix for multi-gpu * Fix image token handling * upadte image token expansion logic * fix a few issues with remote code loading * not related but modular forces us to change all files now * Add overview and code sample to cohere vision docs * add scripts. TMP. * Update inference script * Create script * set dtype in export script * TO revert: modular export fix * Fix scripts * Revert "TO revert: modular export fix" This reverts commit bdb2f305b61027a05f0032ce70d6ca698879191c. * Use modular weights * Upload to hub Removed OOD weights ad script * Updated docs * fix import error Update docs Added pipeline test * Updated docs * Run modular script remove modular for config Added patch_size Added docstrings in modular Fix OOM Add docs, fixup integration tests. 8-gpu passing * tiny updates * address comments + fixup * add test for chat template * check model outputs workaround * aya vision fix check model inputs * Revert "add test for chat template" This reverts commit 42c756e397f588d76b449ff1f93292d8ee0202d8. * reveert more changes * last revert * skip and merge * faulty copy from --------- Co-authored-by: Julian Mack <julian.mack@cohere.com> Co-authored-by: kyle-cohere <kyle@cohere.com>	2025-07-31 10:57:34 +00:00
Jeff Zhang	cb289ad243	feat(tokenization): add encode_message to tokenize messages one by one (#39507 ) * feat(tokenization): add encode_message to tokenize messages one by one * Fix the `encode_message` method, remove the `add_generation_prompt` parameter and add the corresponding error handling. Update the document to reflect this change and verify the error handling in the test. * Optimize the `encode_message` method, improve the processing logic of the empty dialogue history, and ensure that the chat template can be applied correctly when the dialogue history is empty. Update the document to reflect these changes. * The `_encode_message` method is deleted, the message coding logic is simplified, and the functional integrity of the `encode_message` method is ensured. Update the document to reflect these changes. * Docs fix * Revert changes in docstring of pad() * Revert changes in docstring * Update src/transformers/tokenization_utils_base.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Repair the call of the `encode_message` method, update it to `encode_message_with_chat_template` to support the chat template, and adjust the relevant test cases to reflect this change. * Optimize the call format of the `apply_chat_template` method, and merge multi-line calls into a single line to improve code readability. --------- Co-authored-by: pco111 <15262555+pco111@user.noreply.gitee.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-07-31 10:55:45 +02:00
Joao Gante	4f93cc9174	fix: providing a tensor to cache_position in model.generate kwargs always crashes because of boolean test (#39300 ) * fix: cache_position: RuntimeError: Boolean value of Tensor with more than one value is ambiguous * test cache_position * move test * propagate changes --------- Co-authored-by: Masataro Asai <guicho2.71828@gmail.com>	2025-07-30 17:30:28 +00:00
Yuanyuan Chen	1e0665a191	Simplify conditional code (#39781 ) * Use != Signed-off-by: cyy <cyyever@outlook.com> * Use get Signed-off-by: cyy <cyyever@outlook.com> * Format * Simplify bool operations Signed-off-by: cyy <cyyever@outlook.com> --------- Signed-off-by: cyy <cyyever@outlook.com>	2025-07-30 12:32:10 +00:00
Cyril Vallez	67cfe11528	Fix Evolla and xLSTM tests (#39769) * fix all evolla * xlstm	2025-07-30 09:51:55 +02:00
Cyril Vallez	ddd2100767	Fix OmDet test after arg deprecation (#39766) fix arg name	2025-07-29 22:10:36 +02:00
Manuel de Prada Corral	c4e2069898	Fix Cache.max_cache_len max value for Hybrid models (#39737 ) * fix gemma * fix min * fix quant init issue * fix gemma 3n * skip quant cache test * fix modular * new test for Gemma * include cyril change --------- Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>	2025-07-29 17:12:50 +02:00
Raushan Turganbay	1ad216bd7d	[modenbert] fix regression (#39750) * fix regression * add FA2 test	2025-07-29 16:58:59 +02:00
Çağrı Tuğrul Canbol	fb141e2c90	Support loading Qwen3 MoE GGUF (#39638 ) * support loading qwen3 gguf * qwen3moe test cases * fix whitespaces * fix ggml tests	2025-07-29 13:44:44 +00:00
Raushan Turganbay	ccb2e0e03b	Fix GPT2 with cross attention (#39754 ) * fix * use new mask API * style * fix copies and attention tests * fix head pruning tests	2025-07-29 15:40:31 +02:00
Yih-Dar	dfd616e658	Avoid OOM when other tests are failing (#39758 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-07-29 15:35:44 +02:00
Yuanyuan Chen	95faabf0a6	Apply several ruff SIM rules (#37283 ) * Apply ruff SIM118 fix Signed-off-by: cyy <cyyever@outlook.com> * Apply ruff SIM910 fix Signed-off-by: cyy <cyyever@outlook.com> * Apply ruff SIM101 fix Signed-off-by: cyy <cyyever@outlook.com> * Format code Signed-off-by: cyy <cyyever@outlook.com> * More fixes Signed-off-by: cyy <cyyever@outlook.com> --------- Signed-off-by: cyy <cyyever@outlook.com>	2025-07-29 11:40:34 +00:00
Yih-Dar	de8d0cec30	update `GemmaIntegrationTest::test_model_2b_bf16_dola` again (#39731 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-07-29 11:42:55 +02:00
Yao Matrix	f3598a95c7	extend more trainer test cases to XPU, all pass (#39652 ) extend more trainer test cases to XPU Signed-off-by: Yao, Matrix <matrix.yao@intel.com>	2025-07-29 10:51:00 +02:00
Raushan Turganbay	75794792ad	BLIPs clean-up (#35560 ) * blips clean up * update processor * readability * fix processor length * fix copies * tmp * update and fix copies * why keep these, delete? * fix test fetcher * irrelevant comment * fix tests * fix tests * fix copies	2025-07-29 10:03:06 +02:00
Ramesh	4f8f51be4e	Add Fast Segformer Processor (#37024 ) * Add Fast Segformer Processor * Modified the params according to segformer model * modified test_image_processing_Segformer_fast args - removed redundant params like do_center_crop,center_crop which aren't present in the original segformer class * added segmentation_maps processing logic form the slow segformer processing module with references from beitimageprocessing fast * fixed code_quality * added recommended fixes and tests to make sure everything processess smoothly * Fixed SegmentationMapsLogic - modified the preprocessing of segmentation maps to use tensors - added batch support * fixed some mismatched files * modified the tolerance for tests * use modular * fix ci --------- Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>	2025-07-28 19:22:32 +00:00
Avigyan Sinha	c353f2bb5e	Superpoint fast image processor (#37804 ) * feat: superpoint fast image processor * fix: reran fast cli command to generate fast config * feat: updated test cases * fix: removed old model add * fix: format fix * Update src/transformers/models/superpoint/image_processing_superpoint_fast.py Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> * fix: ported to torch and made requested changes * fix: removed changes to init * fix: init fix * fix: init format fix * fixed testcases and ported to torch * fix: format fixes * failed test case fix * fix superpoint fast * fix docstring --------- Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>	2025-07-28 18:15:06 +00:00
Raushan Turganbay	1c6b47451d	Fix cache-related tests (#39676 ) * fix * fix kyutai at last * fix unrelated tests and copies * update musicgen as well * revert tensor * fix old test failures * why it wasn't added?	2025-07-28 17:30:11 +02:00
Eric Bezzam	7623aa3e5f	Fix `Qwen2AudioForConditionalGeneration.forward()` and `test_flash_attn_kernels_inference_equivalence` (#39503 ) * Add missing cache_position argument. * Pass cache_position to language model. * Overwrite prepare_inputs_for_generation. * Set model to half precision for Flash Attention test. * Cast model to bfloat16.	2025-07-28 16:35:08 +02:00
Yih-Dar	28f2619868	skip `Glm4MoeModelTest::test_torch_compile_for_training` (#39670 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-07-28 16:30:40 +02:00
Yih-Dar	88aed92b59	Update `QAPipelineTests::test_large_model_course` after #39193 (#39666 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-07-28 16:26:49 +02:00
Cyril Vallez	686bb3b098	Remove all expired deprecation cycles (#39725 ) * remove all deprecation cycles * style * fix * remove * remove * fix * Update modular_dpt.py * back * typo * typo * final fix * remove all args	2025-07-28 15:43:41 +02:00
Raushan Turganbay	b56d721397	[configuration] remove redundant `classmethod` (#38812 ) * remove redundant classmethod * warning message, add space between words * fix tests * fix copies	2025-07-28 10:38:48 +00:00
Raushan Turganbay	8b237b8639	[processors] add tests for helper fn (#39629 ) * add tests for helpers * duplicate test for each model * why llava next video has no helper * oops must have been in the commit * fix test after rebase * add copy from	2025-07-28 09:41:58 +00:00
BUI Van Tuan	6a61e16626	Fix missing initialization of `FastSpeech2Conformer` (#39689 ) * fix missing initialization of FastSpeech2Conformer * switch order and reactivate tests --------- Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>	2025-07-28 10:47:39 +02:00
Cyril Vallez	18a7c29ff8	More robust tied weight test (#39681 ) * Update test_modeling_common.py * remove old ones * Update test_modeling_common.py * Update test_modeling_common.py * add * Update test_modeling_musicgen_melody.py	2025-07-25 22:03:21 +02:00
Garrett Goon	97f8c71f52	Add padding-free to Granite hybrid moe models (#39677 ) * start fixing kwarg handling * fmt * updates padding free tests * docs * add missing kwargs modeling_granitemoe.py * run modular util * rm unrelated changes from modular util	2025-07-25 20:10:50 +02:00
Cyril Vallez	d6e9f71a6e	Fix tied weight test (#39680) Update test_modeling_common.py	2025-07-25 20:09:33 +02:00
lgai-exaone	c06d4cd6ce	Add EXAONE 4.0 model (#39129 ) * Add EXAONE 4.0 model * Refactor EXAONE 4.0 modeling code * Fix cache slicing on SWA + FA2 * Fix cache slicing on FA2 + HybridCache * Update EXAONE 4.0 modeling code for main branch * Update o_proj for asymmetric projection * Address PR feedback * Add EXAONE 4.0 docs * Update EXAONE 4.0 modeling code for main branch * update * fix updates * updates * fix * fix * fix --------- Co-authored-by: Arthur <arthur.zucker@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-07-25 19:58:28 +02:00
Park Woorak	3e4d584a5b	Support `typing.Literal` as type of tool parameters or return value (#39633 ) * support `typing.Literal` as type of tool parameters * validate the `args` of `typing.Literal` roughly * add test to get json schema for `typing.Literal` type hint * fix: add `"type"` attribute to the parsed result of `typing.Literal` * test: add argument `booleanish` to test multi-type literal * style: auto fixup	2025-07-25 17:51:28 +00:00
Arthur	300d42a43e	Add ep (#39501 ) * EP + updates Co-authored-by: Nouamane Tazi <NouamaneTazi@users.noreply.github.com> Co-authored-by: drbh <drbh@users.noreply.github.com> * remove unrelated change * not working yet but let's see where it goes! * update the api a bit * udpate * where I am at for now * fix ep * refactor the API * yups * fix * fixup * clean modeling * just support llama4 for now! * properly avoid * fix * nits * Update src/transformers/models/llama4/modeling_llama4.py * Update src/transformers/integrations/tensor_parallel.py * style * ,,,, * update --------- Co-authored-by: Nouamane Tazi <NouamaneTazi@users.noreply.github.com> Co-authored-by: drbh <drbh@users.noreply.github.com>	2025-07-25 19:46:17 +02:00
Cyril Vallez	6630c5b714	Add xlstm model (#39665 ) * Add xLSTM cleanly with optimizations. * Fix style. * Fix modeling test. * Make xLSTM package optional. * Fix: Update torch version check. * Fix: Bad variable naming in test. * Fix: Import structure cleaning with Ruff. * Fix: Update docstrings. * Fix: Mitigate unused config attr tests by explicit usage. * Fix: Skip tests, if xlstm library is not installed. * Feat: Enable longer context window for inference by chunking. * Fix: Make training test pass by lowering target accuracy. * Chore: Increase test verbosity for failing generation test. * Update docs/source/en/model_doc/xlstm.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Fix: Make xlstm available even without CUDA. * Chore: Remove unnecessary import. * Fix: Remove BOS insertion. * Chore: Improve xLSTMCache documentation. * Integrate basic xLSTM fallback code. * Chore: Remove unnecessary import. * Chore: Remove duplicate LayerNorm. * chore: update copyright, minor reformatting * fix: refactor mLSTMStateType due to missing torch import * fix: add missing import * Chore: Replace einops. * fix: apply ruff formatting * fix: run `make fix-copies` to re-generate dummy_pt_objects.py * fix: make type hints Python 3.9 compatible * fix: remove obsolete import * fix: remove obsolete method from docs * chore: remove obsolete `force_bos_token_insert` from config * Chore: Remove duplicated xLSTMCache class. * Fix: Formatting of modeling_xlstm.py * Chore: Remove xlstm package requirement from test. Re-add update_rnn_state. * Fix: Update xLSTMCache docstring. * Feat: Add proper initialization of xLSTM. * Chore: Re-format files. * Chore: Adapt format. * Fix: xLSTMCache import restructuring. * Fix: Add __all__ lists to modeling and configuration files. * Chore: Reformat. * Fix: Remove unnecessary update_rnn_state function. * Fix: Undo test accuracy quickfix. * Fix: Update copyright year, remvoe config copy. 
* Chore: Flatten all internal configs to xLSTMConfig. * Fix: Unused config variables check. * Chore: Remove unnecessary imports. * Fix: Unify xlstm cache argument from batch_size to max_batch_size. * Chore: Remove bad default arg value for xLSTMCache. * Chore: Rename core configuration arguments to HF default in xLSTM. * Chore: Fix formatting. * Fix: xLSTM Cache config access. * Fix: Update xlstm tests for config update. * Feat: Re-add embbeding_dim, num_blocks config options for compat with xLSTM-7B. * Fix: Configuration xLSTM python3.9 syntax. * Fix: Difference to main in test_utils.py assertion. * Fix: Bad syntax in xlstm config for python3.9. * Fix: xLSTMConfig docstring. * Fix: xLSTMConfig docstring. * Fix typing issues in xLSTM and BeiT, Paligemma. * Fix: Exclude xLSTM from test cache utils. * Chore: Fix style. * Chore: Fix format. * Chore: Remove unnecessary LayerNorm, NormLayer layer abstractions. * Chore: Remove asserts and replace with ValueErrors. * Chore: Update __init__.py structure of xLSTM. * Chore: Clean xLSTM initialization of weights. * Fix index names in modeling_xlstm.py * Update xlstm model test typing annotations. * Fix: Remove all asserts. * Revert changes to the main __init__.py * Fix: Move xLSTMCache to modeling_xlstm.py * Fix: Remove xLSTMForCausalLM mapping from modeling_auto.py * Remove xLSTMCache from dummy_pt_objects.py * Fix: Remove extended torchdynamo compilation check integrating cuda graph captures. * Revert test_cache_utils.py xLSTM change. * Fix: Move xLSTM init functions before init call. * Remove xLSTMCache from generation utils. * Fix: Clean xLSTM init functionality for recursive calls. * Fix: Move xLSTMCache before its first call. * Fix formatting. * Add partial docstring for xLSTMModel forward. * Fix xLSTMCache docstring in xLSTMModel. * Remove xLSTMCache from public documentation. Update auto_docstring. 
* Remove all agressive shape comments * style * Fix names * simplify * remove output_hidden_states * Update modeling_xlstm.py * Update modeling_xlstm.py * Update test_modeling_xlstm.py * Update modeling_xlstm.py * Update modeling_xlstm.py * fix * fix * style * style --------- Co-authored-by: Korbinian Poeppel <korbinian.poeppel@nx-ai.com> Co-authored-by: Korbinian Pöppel <37810656+kpoeppel@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Sebastian Böck <sebastian.boeck@nx-ai.com> Co-authored-by: Korbinian Poeppel <poeppel@ml.jku.at>	2025-07-25 19:39:17 +02:00
Armaghan Shakir	69cff312f5	Add support for DeepseekAI's DeepseekVL (#36248 ) * upload initial code * update deepseek-vl adaptor * update hierarchy of vision model classes * udpate aligner model * add text model * Added Image Processor * Added Image Processor * Added Image Processor * apply masks * remove projection; add aligner * remove interpolate_pos_encoding * remove unused params in config * cleaning * Add the __init__ file * added processing deepseek_vl class * modified the deepseek-vl processor * modified the deepseek-vl processor * update __init__ * Update the image processor class name * Added Deepseek to src/transformers/__init__.py file * Added Deepseek to image_processing_auto.py * update the __init__ file * update deepseek_vl image processor * Update Deepseek Processor * upload fast image processor * Revert "upload fast image processor" This reverts commit 68c8fd50bafbb9770ac70c9de02448e2519219b4. * update image processor * flatten heirarchy * remove DeepseekVLModel * major update (complete modeling) * auto modeling and other files * formatting * fix quality * replace torchvision in modeling * set default do_normalize to False * add fast image processor template using tool * update image processors * add fast image processor to other files * update liscense * Added deepseek image testcases * update image test * update processor * write CHAT_TEMPLATE * update model for processor * fix processor * minor fixes and formatting * fix image processing and tests * fix interpolation in sam * fix output_attentions in DeepseekVLModel * upload test_modeling * fix tests because of vocab size * set use_high_res_vision=False in tests * fix all modeling tests * fix styling * remove explicit background_color from image processors * added test_processor * added test_processor * fix processor tests * update docs * update docs * update docs * update conversion script * Fixed typos * minor fixes from review - remove model_id comments in examples - remove from pre-trained 
auto mapping - move to image-text-to-text from vision-to-seq in auto mapping - add image_token_index to __init__ for config - remove outdated temporary config in conversion script - update example to use chat_template in docstring example - update liscense 2021->2025 * fix type in config docstring Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz> * update get_image_features * fix config * improve DeepseekVLImageProcessor.preprocess * return image_hidden_states * use AutoTokenizer and AutoImageProcessor in Processor * fix model outputs * make num_image_tokens configurable * fix docstring of processor * move system prompt to chat template * fix repo consistency * fix return_dict * replace SamVisionEncoder with SamVisionModel * update to remove deepcopy * 🛠️ Major Architectural Changes (Adds DeepseekVLHybrid) * fix quality checks * add missing hybrid in auto modeling * run make style * update sam_hq * update high_res_size in test * update docs following #36979 * update code with auto_docstring * update conversion scripts * fix style * fix failing test because of tuple * set weights_only=True in conversion script * use safetensors.torch.load_file instead of torch.load in conversion script * make output_dir optional in conversion script * fix code snippets in docs (now the examples work fine) * integration tests for DeepseekVL * update expected texts * make style * integration tests for DeepseekVLHybrid * fix class name * update expected texts for hybrid * run "make style" * update since changes in main * run make-style * nits since changes in main * undo changes in sam * fix tests * fix tests; update with main * update with main: output_attention/output_hidden_states * fix copied part in deepseek_vl * run fix-copies * fix output_hidden_states * sam: fix _init_weigths * use modular for DeepseekVL * make image processor more modular * modular: use JanusPreTrainedModel * janus: provide kwargs in loss * update processors in conversion script * Revert 
"sam: fix _init_weigths" This reverts commit db625d0c68956c0dad45edd7a469b6a074905c27. * run fix-copies --------- Co-authored-by: Shakib-IO <shakib.khan17@northsouth.edu> Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>	2025-07-25 19:18:50 +02:00
Xibin Bayes Zhou	45c7bfb157	Add evolla rebase main (#36232 ) * add evolla * adding protein encoder part * add initial processing test * save processor * add docstring * add evolla processor * add two test * change vision to protein * change resampler to sequence_compressor * change vision to protein * initial update for llama * add initial update for llamaForCausalLM * add `test_processor`, `test_saprot_output`, `test_protein_encoder_output` * change evolla, but still working on it * add test_single_forward * pass test_attention_outputs * pass test_hidden_states_output * pass test_save_load and test_from_pretrained_no_checkpoint * pass test_cpu_offload * skip some tests * update new progress * skip test_model_is_small * pass test_model_weights_reload_no_missing_tied_weights * pass test_model_get_set_embeddings * pass test_cpu_offload * skip test_resize_embeddings * add pipeline_model_mapping * remote old setUp * pass processor save_pretrained and load_pretrained * remove pooling layer * pass test_inputs_embeds_matches_input_ids * pass test_model_is_small * pass test_attention_outputs * pass test_initialization * pass test_model_get_set_embeddings * pass test_single_forward * skip test_disk_offload_bin and test_disk_offload_safetensors * fix most tests * pass test_protein_encoder_output * remove useless code * add EvollaForProteinText2Text * pass test_saprot_output * pass all EvollaModelTest test and remove processor test * add processor test to its own file * skip is_training since esm skipped it and the saprot code causes error when setting is_training True * pass processor tests * solve all except config * pass most cases * change init * add doc to `configuration_evolla.py` * remove image_processing test * remove extra processor test * remove extra modules * remove extra modules * change all configs into one config * pass all evolla test * pass `make fixup` * update short summary * update Evolla-10B-hf * pass check_dummies.py and check_code_quality * fix 
`tests/models/auto/test_tokenization_auto.py::AutoTokenizerTest::test_model_name_edge_cases_in_mappings` * remove dummy codes * change format * fix llava issue * update format * update to solve llama3 access issue * update to make forward right * solve processor save load problem from instructblip solution * remove unexpected file * skip `test_generation_tester_mixin_inheritance` * add `test_single_forward_correct` and `test_inference_natural_language_protein_reasoning` * add `modular_evolla.py` * solved issue #36362 * run `make fixup` * update modular * solve float32 training * add fix * solve `utils/check_docstrings.py` * update * update * update * remove other files and replace sequential and einsum * add use case in document * update the models * update model * change some wrong code * Update src/transformers/models/evolla/modular_evolla.py Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com> * Update src/transformers/models/evolla/modular_evolla.py Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com> * Update src/transformers/models/evolla/modular_evolla.py Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com> * Update src/transformers/models/evolla/modular_evolla.py Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com> * fix issues mentioned in PR * update style and rearrange the placement * fix return_dict argument issue * solve SaProtConfig issue * Solve EvollaSaProtRotaryEmbedding issue * solve attention_mask issue * solve almosst all issues * make style * update config * remove unrelated pickle file * delete pickle files * fix config * simplify a lot * remove past k-v from encoder * continue work * style * skip it from init * fix init * fix init * simplify more * fill in docstrings * change test for generation * skip test * fix style --------- Co-authored-by: Chenchen Han <13980209828@163.com> Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co> Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>	2025-07-25 19:11:57 +02:00
Yih-Dar	2670da66ce	update expected outputs for whisper after #38778 (#39304 ) * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-07-25 16:48:10 +00:00

5260 Commits