[Docs] Model_doc structure/clarity improvements (#26876)
* first batch of structure improvements for model_docs * second batch of structure improvements for model_docs * more structure improvements for model_docs * more structure improvements for model_docs * structure improvements for cv model_docs * more structural refactoring * addressed feedback about image processors
This commit is contained in:
@@ -31,7 +31,9 @@ teacher learning and contrastive learning. We validate our method through evalua
|
||||
performances on a bunch of tasks including ImageNet-CN, Flicker30k- CN, and COCO-CN. Further, we obtain very close performances with
|
||||
CLIP on almost all tasks, suggesting that one can simply alter the text encoder in CLIP for extended capabilities such as multilingual understanding.*
|
||||
|
||||
## Usage
|
||||
This model was contributed by [jongjyh](https://huggingface.co/jongjyh).
|
||||
|
||||
## Usage tips and example
|
||||
|
||||
The usage of AltCLIP is very similar to the CLIP. the difference between CLIP is the text encoder. Note that we use bidirectional attention instead of casual attention
|
||||
and we take the [CLS] token in XLM-R to represent text embedding.
|
||||
@@ -50,7 +52,6 @@ The [`AltCLIPProcessor`] wraps a [`CLIPImageProcessor`] and a [`XLMRobertaTokeni
|
||||
encode the text and prepare the images. The following example shows how to get the image-text similarity scores using
|
||||
[`AltCLIPProcessor`] and [`AltCLIPModel`].
|
||||
|
||||
|
||||
```python
|
||||
>>> from PIL import Image
|
||||
>>> import requests
|
||||
@@ -70,11 +71,11 @@ encode the text and prepare the images. The following example shows how to get t
|
||||
>>> probs = logits_per_image.softmax(dim=1) # we can take the softmax to get the label probabilities
|
||||
```
|
||||
|
||||
Tips:
|
||||
<Tip>
|
||||
|
||||
This model is build on `CLIPModel`, so use it like a original CLIP.
|
||||
This model is based on `CLIPModel`, use it like you would use the original [CLIP](clip).
|
||||
|
||||
This model was contributed by [jongjyh](https://huggingface.co/jongjyh).
|
||||
</Tip>
|
||||
|
||||
## AltCLIPConfig
|
||||
|
||||
|
||||
Reference in New Issue
Block a user