Add TFCLIPModel (#13967)

* Start the work for TFCLIPModel

* Convert to TF code (TODO: loss + doc)

* Clean up

* Fix pooled_output for TFCLIPTextTransformer - using tf.gather_nd

* assert -> raise error

* Expose TFCLIPModel

* Deal with dummy_inputs

* Add tests

* Fix all tests. TODO: manual check weight loading + add more comments

* Fix pt tf equivalence test

* fixes

* update TFCLIPVisionEmbeddings's Conv2D

* Fix loss + overwrite test_pt_tf_model_equivalence from common

* Add a comment about the change about MainLayer in test_keras_save_load

* Set return_loss=True in TFCLIPModelTester + make tests pass

* overwrite test_pt_tf_model_equivalence from tf common

* fix base_model_prefix

* Fix examples

* remove unused

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply review suggestions

* change self.pre_layrnorm to self.pre_layernorm

* apply more review suggestions

* return attention probs before dropout (to align with PT)

* fix weight init

* fix

* build doc

* fix missing doc

* fix for test

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
This commit is contained in:
Yih-Dar
2021-12-23 17:19:44 +01:00
committed by GitHub
parent 2d30443cd3
commit 8f2cc1c3ab
13 changed files with 2448 additions and 4 deletions

View File

@@ -125,6 +125,23 @@ This model was contributed by [valhalla](https://huggingface.co/valhalla). The o
[[autodoc]] CLIPVisionModel
- forward
## TFCLIPModel
[[autodoc]] TFCLIPModel
- call
- get_text_features
- get_image_features
## TFCLIPTextModel
[[autodoc]] TFCLIPTextModel
- call
## TFCLIPVisionModel
[[autodoc]] TFCLIPVisionModel
- call
## FlaxCLIPModel
[[autodoc]] FlaxCLIPModel