Yoni Gozlan
5f0c181f4e
Uniformize kwargs for image-text-to-text processors ( #32544 )
...
* uniformize FUYU processor kwargs
* Uniformize instructblip processor kwargs
* Fix processor kwargs and tests Fuyu, InstructBlip, Kosmos2
* Uniformize llava_next processor
* Fix save_load test for processor with chat_template only as extra init args
* Fix import Unpack
* Fix Fuyu Processor import
* Fix FuyuProcessor import
* Fix FuyuProcessor
* Add defaults for specific kwargs kosmos2
* Fix Udop to return BatchFeature instead of BatchEncoding and uniformize kwargs
* Add tests processor Udop
* remove Copied from in processing Udop as change of input orders caused by BatchEncoding -> BatchFeature
* Fix overwrite tests kwargs processors
* Add warnings and BC for changes in processor inputs order, change docs, add BC for text_pair as arg for Udop
* Fix processing test fuyu
* remove unnecessary pad_token check in instructblip ProcessorTest
* Fix BC tests and cleanup
* FIx imports fuyu
* Uniformize Pix2Struct
* Fix wrong name for FuyuProcessorKwargs
* Fix slow tests reversed inputs align fuyu llava-next, change udop warning
* Fix wrong logging import udop
* Add check images text input order
* Fix copies
* change text pair handling when positional arg
* rebase on main, fix imports in test_processing_common
* remove optional args and udop uniformization from this PR
* fix failing tests
* remove unnecessary test, fix processing utils and test processing common
* cleanup Unpack
* cleanup
* fix conflict grounding dino
2024-09-24 21:28:19 -04:00
Pablo Montalvo
413008c580
add uniform processors for altclip + chinese_clip ( #31198 )
...
* add initial design for uniform processors + align model
* add uniform processors for altclip + chinese_clip
* fix mutable default 👀
* add configuration test
* handle structured kwargs w defaults + add test
* protect torch-specific test
* fix style
* fix
* rebase
* update processor to generic kwargs + test
* fix style
* add sensible kwargs merge
* update test
* fix assertEqual
* move kwargs merging to processing common
* rework kwargs for type hinting
* just get Unpack from extensions
* run-slow[align]
* handle kwargs passed as nested dict
* add from_pretrained test for nested kwargs handling
* [run-slow]align
* update documentation + imports
* update audio inputs
* protect audio types, silly
* try removing imports
* make things simpler
* simplerer
* move out kwargs test to common mixin
* [run-slow]align
* skip tests for old processors
* [run-slow]align, clip
* !$#@!! protect imports, darn it
* [run-slow]align, clip
* [run-slow]align, clip
* update common processor testing
* add altclip
* add chinese_clip
* add pad_size
* [run-slow]align, clip, chinese_clip, altclip
* remove duplicated tests
* fix
* update doc
* improve documentation for default values
* add model_max_length testing
This parameter depends on tokenizers received.
* Raise if kwargs are specified in two places
* fix
* match defaults
* force padding
* fix tokenizer test
* clean defaults
* move tests to common
* remove try/catch block
* deprecate kwarg
* format
* add copyright + remove unused method
* [run-slow]altclip, chinese_clip
* clean imports
* fix version
* clean up deprecation
* fix style
* add corner case test on kwarg overlap
* resume processing - add Unpack as importable
* add tmpdirname
* fix altclip
* fix up
* add back crop_size to specific tests
* generalize tests to possible video_processor
* add back crop_size arg
* fixup overlapping kwargs test for qformer_tokenizer
* remove copied from
* fixup chinese_clip tests values
* fixup tests - qformer tokenizers
* [run-slow] altclip, chinese_clip
* remove prepare_image_inputs
2024-09-19 17:21:54 +02:00
amyeroberts
d2dcff96f8
[InstructBLIP] qformer_tokenizer is required input ( #33222 )
...
* [InstructBLIP] qformer_tokenizer is required input
* Bit safer
* Add to instructblipvideo processor
* Fix up
* Use video inputs
* Update tests/models/instructblipvideo/test_processor_instructblipvideo.py
2024-09-04 16:18:06 +01:00
Raushan Turganbay
a29eabd0eb
Expand inputs in processors for VLMs ( #30962 )
...
* let it be
* draft
* should not have changed
* add warnings
* fix & add tests
* fix tests
* ipnuts embeds cannot be passed with pixels
* more updates
* paligemma ready!
* minor typos
* update blip-2
* fix tests & raise error
* docstring
* add blip2 test
* tmp
* add image seq length to config
* update docstring
* delete
* fix tests
* fix blip
* fix paligemma
* out-of-place scatter
* add llava-next-video
* Update src/transformers/models/blip_2/modeling_blip_2.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com >
* remove tmp
* codestyle
* nits
* more nits
* remove overriding in tests
* comprehension when merging video
* fix-copies
* revert changes for embeds test
* fix tests after making comprehension
* Update src/transformers/models/blip_2/processing_blip_2.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com >
* Update src/transformers/models/blip_2/processing_blip_2.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com >
* more updates
* fix tests
---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com >
2024-08-13 10:14:39 +05:00
Pavel Iakubovskii
fb66ef8147
Update kwargs validation for preprocess with decorator ( #32024 )
...
* BLIP preprocess
* BIT preprocess
* BRIDGETOWER preprocess
* CHAMELEON preprocess
* CHINESE_CLIP preprocess
* CONVNEXT preprocess
* DEIT preprocess
* DONUT preprocess
* DPT preprocess
* FLAVA preprocess
* EFFICIENTNET preprocess
* FUYU preprocess
* GLPN preprocess
* IMAGEGPT preprocess
* INTRUCTBLIPVIDEO preprocess
* VIVIT preprocess
* ZOEDEPTH preprocess
* VITMATTE preprocess
* VIT preprocess
* VILT preprocess
* VIDEOMAE preprocess
* VIDEOLLAVA
* TVP processing
* TVP fixup
* SWIN2SR preprocess
* SIGLIP preprocess
* SAM preprocess
* RT-DETR preprocess
* PVT preprocess
* POOLFORMER preprocess
* PERCEIVER preprocess
* OWLVIT preprocess
* OWLV2 preprocess
* NOUGAT preprocess
* MOBILEVIT preprocess
* MOBILENETV2 preprocess
* MOBILENETV1 preprocess
* LEVIT preprocess
* LAYOUTLMV2 preprocess
* LAYOUTLMV3 preprocess
* Add test
* Update tests
2024-08-06 11:33:05 +01:00
Raushan Turganbay
fc689d75a0
Add video modality for InstrucBLIP ( #30182 )
...
* squash in single commit
* add docs
* dummy obj
* more changes in diff converter
* tiny fix
* make docs happy
* skip test
* repo consistency tests
* update docstring
* style
* fix tests
* change diff imports
* [run-slow] instructblipvideo
* [run-slow] instructblipvideo
* fix tests and remove logit check
* [run-slow] instructblipvideo
2024-06-25 15:45:39 +05:00