[ViTMAE] Various fixes (#15221)

* Add MAE to AutoFeatureExtractor
* Add link to notebook
* Fix relative paths
2022-01-19 15:27:57 +01:00
parent 6d92c429c7
commit 842298f84f
4 changed files with 22 additions and 16 deletions
--- a/docs/source/model_doc/vit_mae.mdx
+++ b/docs/source/model_doc/vit_mae.mdx
@@ -32,6 +32,7 @@ Tips:

 - MAE (masked auto encoding) is a method for self-supervised pre-training of Vision Transformers (ViTs). The pre-training objective is relatively simple:
 by masking a large portion (75%) of the image patches, the model must reconstruct raw pixel values. One can use [`ViTMAEForPreTraining`] for this purpose.
+- A notebook that illustrates how to visualize reconstructed pixel values with [`ViTMAEForPreTraining`] can be found [here](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/ViTMAE/ViT_MAE_visualization_demo.ipynb).
 - After pre-training, one "throws away" the decoder used to reconstruct pixels, and one uses the encoder for fine-tuning/linear probing. This means that after
 fine-tuning, one can directly plug in the weights into a [`ViTForImageClassification`].
 - One can use [`ViTFeatureExtractor`] to prepare images for the model. See the code examples for more info.