[TensorFlow] Adding GroupViT (#18020)
* chore: initial commit * chore: adding util methods yet to work on the nn.functional.interpolate port with align_corener=True * chore: refactor the utils * used tf.compat.v1.image.resize to align the F.interpolate function * added type hints to the method signatures * added references to the gists where one 2 one alignment of torch and tf has been shown * chore: adding the layers * chore: porting all the layers from torch to tf This is the initial draft, nothing is tested yet. * chore: aligning the layers with reference to tf clip * chore: aligning the modules * added demaraction comments * added copied and adapted from comments * chore: aligning with CLIP * chore: wrangling the layers to keep it tf compatible * chore: aligning the names of the layers for porting * chore: style changes * chore: adding docs and inits * chore: adding tfp dependencis the code is taken from TAPAS * chore: initial commit for testing * chore: aligning the vision embeddings with the vit implementatino * chore: changing model prefix * chore: fixing the name of the model and the layer normalization test case * chore: every test passes but the slow ones * chore: fix style and integration test * chore: moving comments below decorators * chore: make fixup and fix-copies changes * chore: adding the Vision and Text Model to check_repo * chore: modifying the prefix name to align it with the torch implementation * chore: fix typo in configuration * choer: changing the name of the model variable * chore: adding segmentation flag * chore: gante's review * chore: style refactor * chore: amy review * chore: adding shape_list to parts that have been copied from other snippets * chore: init batchnorm with torch defaults * chore: adding shape_list to pass the tests * test fix: adding seed as 0 * set seed * chore: changing the straight through trick to fix -ve dimensinos * chore: adding a dimension to the loss * chore: adding reviewers and contributors names to the docs * chore: added changes after review * chore: code quality fixup * chore: fixing the segmentation snippet * chore: adding to the layer calls * chore: changing int32 to int64 for inputs of serving * chore: review changes * chore: style changes * chore: remove from_pt=True * fix: repo consistency Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
This commit is contained in:
committed by
GitHub
parent
bb6fa06f2d
commit
0dc7b3a785
@@ -26,7 +26,7 @@ Tips:
|
||||
- You may specify `output_segmentation=True` in the forward of `GroupViTModel` to get the segmentation logits of input texts.
|
||||
- The quickest way to get started with GroupViT is by checking the [example notebooks](https://github.com/xvjiarui/GroupViT/blob/main/demo/GroupViT_hf_inference_notebook.ipynb) (which showcase zero-shot segmentation inference). One can also check out the [HuggingFace Spaces demo](https://huggingface.co/spaces/xvjiarui/GroupViT) to play with GroupViT.
|
||||
|
||||
This model was contributed by [xvjiarui](https://huggingface.co/xvjiarui).
|
||||
This model was contributed by [xvjiarui](https://huggingface.co/xvjiarui). The TensorFlow version was contributed by [ariG23498](https://huggingface.co/ariG23498) with the help of [Yih-Dar SHIEH](https://huggingface.co/ydshieh), [Amy Roberts](https://huggingface.co/amyeroberts), and [Joao Gante](https://huggingface.co/joaogante).
|
||||
The original code can be found [here](https://github.com/NVlabs/GroupViT).
|
||||
|
||||
|
||||
@@ -59,3 +59,20 @@ The original code can be found [here](https://github.com/NVlabs/GroupViT).
|
||||
|
||||
[[autodoc]] GroupViTVisionModel
|
||||
- forward
|
||||
|
||||
## TFGroupViTModel
|
||||
|
||||
[[autodoc]] TFGroupViTModel
|
||||
- call
|
||||
- get_text_features
|
||||
- get_image_features
|
||||
|
||||
## TFGroupViTTextModel
|
||||
|
||||
[[autodoc]] TFGroupViTTextModel
|
||||
- call
|
||||
|
||||
## TFGroupViTVisionModel
|
||||
|
||||
[[autodoc]] TFGroupViTVisionModel
|
||||
- call
|
||||
Reference in New Issue
Block a user