Add InternVL (2.5 MPO) (#35968)
* initial commit
* add convert internvl
* add first end-to-end working internvl
* nit prompt and image proc
* add working chat template
* add conversion llama-based models
* add tests
* pass all tests
* fix isort
* fix modular after main merge
* add video processing for internvl
* add support for interlaced images and videos
* Remove processing and config from modular, add more tests
* add llama model tests
* Modify processor for compatibility with refactored got ocr image processor
* add comments in processor
* Add docs and nits
* change video processing to use custom sample_indices_fn
* rebase and fix tests
* add processor tests
* Add changes Raushan review
* Use the new attention interface for the vision model
* nits
* add support for custom video_load_backend
* remove mention to InternVLTokenizer
* refactor vision model to simplify logic
* refactor processor for better readability
* fix copies
* fix require av processor test
* refactor internVL vision
* Update processor and fix processing tests
* fix docstring
* update convert_weights for internvl3
* change image processor to fast by default
* remove do_center_crop=True in convert_weights
* force use_cache to True
* push_to_hub before reloading
* fix internVLVision for larger models
* update convert weight for qk norm
* fix convert_weights
* fix eos_token_id in convert
* update docs and integration tests
* make modifs after review
* fix wrong k_norm and reduce modular
* change image_token_index to image_token_id
* change checkpoint to OpenGVLab org
* last nits
* explicitly del self.num_key_value_groups
* add extra special tokens
@@ -156,6 +156,7 @@ IGNORE_NON_TESTED = (
     "Llama4VisionModel",  # Building part of bigger (tested) model. # TODO: add tests
     "Emu3VQVAE",  # Building part of bigger (tested) model
     "Emu3TextModel",  # Building part of bigger (tested) model
+    "InternVLVisionModel",  # Building part of bigger (tested) model
     "JanusVisionModel",  # Building part of bigger (tested) model
     "TimesFmModel",  # Building part of bigger (tested) model
 ]