SmolVLM2 (#36126) · 4397dfcb71 - HuggingFace_transformer

SmolVLM2 (#36126)

Some checks failed

Release - Conda / build_and_package (push) Has been cancelled

Secret Leaks / trufflehog (push) Has been cancelled

* smolvlm init

* updates

* fixing bugs

* minimal run, no checks

* minimal run, no checks

* passing first check + adding url support

* updating video dataloading logic

* fixing image logic

* trying modular, but fails

* modular is working, changing processor to match PR comments and general transformers logic

* fixing kwargs

* offloading video loading logic to image_util

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* fixing circleci code formatting errors

* update

* add idefics3-based tests

* add keyword to all

* add PreTrainedModel

* updating video loading logic

* working inference

* updates for PR comments

* updates for PR comments

* moving SmolVLMPretrainedModel higher to fix import error

* CI test pass

* CI test pass

* removing lambda

* CI test pass

* CI test pass

* CI test pass

* CI test pass

* CI test pass

* CI test pass

* processor tests

* add example in docs

* typo

* fix copies

* skip compile tests - sdpa for VisionTransformer

* fix init

* raise import error for num2words

* update doc for FA2

* more doc fix

* CI

* updates for PR comments

* Update docs/source/en/model_doc/smolvlm.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/smolvlm.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/smolvlm.md

Co-authored-by: Joshua Lochner <admin@xenova.com>

* Update docs/source/en/model_doc/smolvlm.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update docs/source/en/model_doc/smolvlm.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* fixing processor -- tokenizer not defined properly, (gpt2 tokenizer), and does not have the attributes of fake image token, etc

* adding smolvlm to VQA models

* removing vqa auto class

* Update src/transformers/models/smolvlm/processing_smolvlm.py

Co-authored-by: Joshua Lochner <admin@xenova.com>

* removing smolvlmvisiontransformer from index.md

* my bad, video processing had typos

* fixing docs

* renaming params in SmolVLMModel.inputs_merger

* removing un-needed dtype/device in model forward

* ruff for CI

* update docs

* Update docs/source/en/model_doc/smolvlm.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* return cache position

* return cache position

* return cache also in modular

* needed to run modular again

* fix training tests

* push vectorized inputs merger

* format

* format

* reduce number of mappings

* addressing PR comments

* happy CI, happy me :)

* skip non-nested images

* adjust integration test for smaller GPUs

* format

* fix kwargs in chat template apply

* skip this for now

---------

Co-authored-by: raushan <raushan@huggingface.co>
Co-authored-by: Pablo <pablo.montalvo.leroux@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Joshua Lochner <admin@xenova.com>

Author: Orr Zohar
Date: 2025-02-20 06:00:26 -08:00
Committed by: GitHub
Parent: f2ab182dca
Commit: 4397dfcb71

29 changed files with 5108 additions and 4 deletions

									
docs/source/en/index.md (1 line changed)

@@ -317,6 +317,7 @@ Flax), PyTorch, and/or TensorFlow.
 |                           [SEW](model_doc/sew)                           |       ✅        |         ❌         |      ❌      |
 |                         [SEW-D](model_doc/sew-d)                         |       ✅        |         ❌         |      ❌      |
 |                        [SigLIP](model_doc/siglip)                        |       ✅        |         ❌         |      ❌      |
+|                       [SmolVLM](model_doc/smolvlm)                       |       ✅        |         ❌         |      ❌      |
 |        [Speech Encoder decoder](model_doc/speech-encoder-decoder)        |       ✅        |         ❌         |      ✅      |
 |                 [Speech2Text](model_doc/speech_to_text)                  |       ✅        |         ✅         |      ❌      |
 |                      [SpeechT5](model_doc/speecht5)                      |       ✅        |         ❌         |      ❌      |
