[Model] Cohere2 Vision (#39810)

* Add cohere2_vision to support CohereLabs/command-a-vision-07-2025

* update and add modular file

* update processors and check with orig impl later

* delete unused files

* image processor reduce LOC and re-use GotOCR2

* update the config to use modular

* model tests pass

* processor fixes

* check model outputs decorator

* address one more comment

* Update tokens. Temp - need to read from tokenizer

* fix for multi-gpu

* Fix image token handling

* update image token expansion logic

* fix a few issues with remote code loading

* not related but modular forces us to change all files now

* Add overview and code sample to cohere vision docs

* add scripts. TMP.

* Update inference script

* Create script

* set dtype in export script

* TO revert: modular export fix

* Fix scripts

* Revert "TO revert: modular export fix"

This reverts commit bdb2f305b61027a05f0032ce70d6ca698879191c.

* Use modular weights

* Upload to hub

Removed OOD weights and script

* Updated docs

* fix import error

Update docs

Added pipeline test

* Updated docs

* Run modular script

remove modular for config

Added patch_size

Added docstrings in modular

Fix OOM

Add docs, fixup integration tests. 8-gpu passing

* tiny updates

* address comments + fixup

* add test for chat template

* check model outputs workaround

* aya vision fix check model inputs

* Revert "add test for chat template"

This reverts commit 42c756e397f588d76b449ff1f93292d8ee0202d8.

* revert more changes

* last revert

* skip and merge

* faulty copy from

---------

Co-authored-by: Julian Mack <julian.mack@cohere.com>
Co-authored-by: kyle-cohere <kyle@cohere.com>
Raushan Turganbay
2025-07-31 12:57:34 +02:00
committed by GitHub
parent 6c3f27ba61
commit e1688d28d3
32 changed files with 2375 additions and 48 deletions

@@ -4677,9 +4677,13 @@ class ModelTesterMixin:
                 sub_config = getattr(config, key)
                 update_config_for_flex(sub_config)
 
-            model = model_class(config).to(device=torch_device)
-            model.set_attn_implementation("flex_attention")
-            self.assertTrue(model.config._attn_implementation == "flex_attention")
+            if model_class._can_set_attn_implementation():
+                model = model_class(config).to(device=torch_device)
+                model.set_attn_implementation("flex_attention")
+                self.assertTrue(model.config._attn_implementation == "flex_attention")
+            else:
+                config._attn_implementation = "flex_attention"
+                model = model_class(config).to(device=torch_device)
 
             # Elaborate workaround for encoder-decoder models as some do not specify their main input
             dummy_inputs = {model.main_input_name: inputs_dict[model.main_input_name].to(torch_device)}
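
Note: the conditional in the hunk above can be read in isolation as a small fallback pattern. Switch the attention backend on the instance when the class allows it, otherwise set it on the config before construction. The sketch below is only an illustration of that branching. `ToyConfig`, `SwitchableModel`, `FixedModel`, and `build_with_flex` are hypothetical stand-ins that borrow the method names seen in the diff; they are not library classes, and the real test also moves the model to `torch_device`.

```python
# Toy stand-ins that only borrow method names from the diff; not library code.
class ToyConfig:
    def __init__(self):
        self._attn_implementation = "eager"


class SwitchableModel:
    """Stand-in for a model whose attention backend can be changed after init."""

    def __init__(self, config):
        self.config = config

    @classmethod
    def _can_set_attn_implementation(cls):
        return True

    def set_attn_implementation(self, name):
        self.config._attn_implementation = name


class FixedModel(SwitchableModel):
    """Stand-in for a model that must read its backend from the config at init."""

    @classmethod
    def _can_set_attn_implementation(cls):
        return False

    def set_attn_implementation(self, name):
        raise RuntimeError("backend cannot be switched after construction")


def build_with_flex(model_class, config):
    """Follow the same two branches as the test: switch after init, or via config."""
    if model_class._can_set_attn_implementation():
        model = model_class(config)
        model.set_attn_implementation("flex_attention")
    else:
        # Fallback: configure the backend before the model is constructed.
        config._attn_implementation = "flex_attention"
        model = model_class(config)
    return model


for cls in (SwitchableModel, FixedModel):
    model = build_with_flex(cls, ToyConfig())
    assert model.config._attn_implementation == "flex_attention"
```

Both toy classes end up with `flex_attention` in their config, and the fallback branch never calls `set_attn_implementation` on a class that reports it cannot be switched.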