Add aya (#36521)

* initial commit
* small fix
* move stuff to image processing file
* remove stuff in validate turn and fix return tensor
* remove liquid stuff
* in the process of addressing comments
* changes to get the right tokenization
* new __init__ works
* fixing default std and mean
* works
* small testing script -- to be deleted before merge
* remove redundant code
* addressing comments
* fix inits, add docs templates
* refactor processor, switch to gotocr image processor
* remove image proc from init
* refactor to working llava-style architecture
* Change AyaVisionModel to AyaVisionForConditionalGeneration
* add tests
* fixups
* update doc
* Adding logits_to_keep explicitly in ayavision forward to enable compatibility with cohere model
* better variable names + remove code paths
* Updates to aya_vision.md
* address comments
* adding copied from
* make style and remove unused projector_hidden_act from config
* sort init
* include usage of fast image proc and proc on cuda in doc
* update checkpoint in test processor
* update checkpoint in test processor 2
* remove test_model and update docstring
* skip failing tests

---------

Co-authored-by: Saurabh Dash <saurabh@cohere.com>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
2025-03-04 12:24:33 +01:00
parent c0f8d055ce
commit 84f0186e89
17 changed files with 1928 additions and 1 deletions
--- a/tests/generation/test_utils.py
+++ b/tests/generation/test_utils.py
@@ -113,7 +113,18 @@ from transformers.utils import is_sklearn_available


 # TODO: raushan remove this when VLMs start accepting input embeds
-VLM_CLASS_NAMES = ["llava", "idefics2", "idefics3", "mllama", "paligemma", "emu3", "gotocr2", "qwen2vl", "qwen2_5_vl"]
+VLM_CLASS_NAMES = [
+    "llava",
+    "idefics2",
+    "idefics3",
+    "mllama",
+    "paligemma",
+    "emu3",
+    "gotocr2",
+    "qwen2vl",
+    "qwen2_5_vl",
+    "ayavision",
+]


 class GenerationTesterMixin: