[Docs] Model_doc structure/clarity improvements (#26876)
* first batch of structure improvements for model_docs
* second batch of structure improvements for model_docs
* more structure improvements for model_docs
* more structure improvements for model_docs
* structure improvements for cv model_docs
* more structural refactoring
* addressed feedback about image processors
@@ -32,7 +32,9 @@ simpler. Further analysis demonstrates that VisualBERT can ground elements of la
 explicit supervision and is even sensitive to syntactic relationships, tracking, for example, associations between
 verbs and image regions corresponding to their arguments.*

-Tips:
+This model was contributed by [gchhablani](https://huggingface.co/gchhablani). The original code can be found [here](https://github.com/uclanlp/visualbert).
+
+## Usage tips

 1. Most of the checkpoints provided work with the [`VisualBertForPreTraining`] configuration. Other
 checkpoints provided are the fine-tuned checkpoints for down-stream tasks - VQA ('visualbert-vqa'), VCR
@@ -43,8 +45,6 @@ Tips:
 We do not provide the detector and its weights as a part of the package, but it will be available in the research
 projects, and the states can be loaded directly into the detector provided.

-## Usage
-
 VisualBERT is a multi-modal vision and language model. It can be used for visual question answering, multiple choice,
 visual reasoning and region-to-phrase correspondence tasks. VisualBERT uses a BERT-like transformer to prepare
 embeddings for image-text pairs. Both the text and visual features are then projected to a latent space with identical
@@ -92,8 +92,6 @@ The following example shows how to get the last hidden state using [`VisualBertM
 >>> last_hidden_state = outputs.last_hidden_state
 ```

-This model was contributed by [gchhablani](https://huggingface.co/gchhablani). The original code can be found [here](https://github.com/uclanlp/visualbert).
-
 ## VisualBertConfig

 [[autodoc]] VisualBertConfig