AutoImageProcessor (#20111)
* AutoImageProcessor skeleton * Update references * Add mapping in init * Add model image processors to __init__ for importing * Add AutoImageProcessor tests * Fix up * Image Processor documentation * Remove pdb * Update docs/source/en/model_doc/mobilevit.mdx * Update docs * Don't add whitespace on json files * Remove fixtures * Move checking model config down * Fix up * Add check for image processor * Remove FeatureExtractorMixin in docstrings * Rename model_tmpfile to config_tmpfile * Don't make None if not in image processor map
This commit is contained in:
@@ -38,12 +38,12 @@ Tips:
|
||||
This processor wraps a feature extractor (for the image modality) and a tokenizer (for the language modality) into one.
|
||||
- ViLT is trained with images of various sizes: the authors resize the shorter edge of input images to 384 and limit the longer edge to
|
||||
under 640 while preserving the aspect ratio. To make batching of images possible, the authors use a `pixel_mask` that indicates
|
||||
which pixel values are real and which are padding. [`ViltProcessor`] automatically creates this for you.
|
||||
- The design of ViLT is very similar to that of a standard Vision Transformer (ViT). The only difference is that the model includes
|
||||
which pixel values are real and which are padding. [`ViltProcessor`] automatically creates this for you.
|
||||
- The design of ViLT is very similar to that of a standard Vision Transformer (ViT). The only difference is that the model includes
|
||||
additional embedding layers for the language modality.
|
||||
|
||||
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/vilt_architecture.jpg"
|
||||
alt="drawing" width="600"/>
|
||||
alt="drawing" width="600"/>
|
||||
|
||||
<small> ViLT architecture. Taken from the <a href="https://arxiv.org/abs/2102.03334">original paper</a>. </small>
|
||||
|
||||
@@ -63,6 +63,11 @@ Tips:
|
||||
[[autodoc]] ViltFeatureExtractor
|
||||
- __call__
|
||||
|
||||
## ViltImageProcessor
|
||||
|
||||
[[autodoc]] ViltImageProcessor
|
||||
- preprocess
|
||||
|
||||
## ViltProcessor
|
||||
|
||||
[[autodoc]] ViltProcessor
|
||||
|
||||
Reference in New Issue
Block a user