Add GIT (GenerativeImage2Text) (#20295)

* First draft

* Make model instantiation work

* Fix copied from statement

* More fixes

* Add correct output head

* Improve configuration

* Add conversion script

* Improve conversion script

* Remove token_type_ids

* Fix conversion of projection layers

* Convert all weights

* Use cats image

* Make logits match

* Generate caption on cats image

* Add GITProcessor

* Update conversion script

* Add support for more checkpoints

* Fix conversion script

* Add initial tests

* Remove cross-attention

* More improvements

* Remove is_decoder

* Improve model tests

* Improve tests

* Improve model outputs

* Fix model outputs equivalence

* Fix more tests

* Remove unused code

* Use generate to generate text, no use of cache for now

* Use generate more appropriately

* Fix config tests

* Fix style

* Add support for use_cache

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Fix style

* Fix GIT vision encoder

* Update README

* Fix integration test

* Set bos and eos token ids

* Improve docs

* Improve code

* Add support for provided attention_mask

* Add copied from statement

* Fix gradient checkpointing test

* Set model_input_names

* Investigate model_input_names

* Remove script

* Fix model inputs

* Fix docstring

* Rename GIT to Git

* Support more models

* Add support for textvqa model

* Add video support

* Extend conversion script for video

* Add support for large variant

* Add support for more models

* Fix config archive map

* Update integration test

* Fix README

* Fix CLIP mean and std

* Update processor

* Fix use_cache for video, thanks @gante

* Remove print statements

* Remove assertion

* Add processor tests

* Fix model_input_names

* Use Auto API for processor

* Fix processor tests

* Fix integration test

* Fix pipeline test

* Make tests faster

* Update conversion script

* Update conversion script

* Convert more checkpoints

* Update conversion script

* Fix typo

* Update docstrings

* Improve code snippets

* Fix doc tests

* Add more code examplesé

* Fix doc tests

* Add integration tests

* Fix unused variable

* revert

* Add GIT to Japanese README

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
This commit is contained in:
NielsRogge
2023-01-03 14:17:18 +01:00
committed by GitHub
parent 305f41e4de
commit 9c6f7485a6
32 changed files with 3150 additions and 16 deletions

View File

@@ -212,7 +212,11 @@ class TextGenerationPipelineTests(unittest.TestCase, metaclass=PipelineTestCaseM
# it requires BOS token to exist.
# Special case for Pegasus which will always append EOS so will
# work even without BOS.
if text_generator.tokenizer.bos_token_id is not None or "Pegasus" in tokenizer.__class__.__name__:
if (
text_generator.tokenizer.bos_token_id is not None
or "Pegasus" in tokenizer.__class__.__name__
or "Git" in model.__class__.__name__
):
outputs = text_generator("")
self.assertEqual(outputs, [{"generated_text": ANY(str)}])
else: