[Docs] Model_doc structure/clarity improvements (#26876)

* first batch of structure improvements for model_docs
* second batch of structure improvements for model_docs
* more structure improvements for model_docs
* more structure improvements for model_docs
* structure improvements for cv model_docs
* more structural refactoring
* addressed feedback about image processors
2023-11-03 10:57:03 -04:00
parent ad8ff96224
commit 5964f820db
223 changed files with 1796 additions and 1116 deletions
--- a/docs/source/en/model_doc/fnet.md
+++ b/docs/source/en/model_doc/fnet.md
@@ -37,15 +37,15 @@ sequence lengths on GPUs (and across relatively shorter lengths on TPUs). Finall
 and is particularly efficient at smaller model sizes; for a fixed speed and accuracy budget, small FNet models
 outperform Transformer counterparts.*

-Tips on usage:
-
- The model was trained without an attention mask as it is based on Fourier Transform. The model was trained with
-  maximum sequence length 512 which includes pad tokens. Hence, it is highly recommended to use the same maximum
-  sequence length for fine-tuning and inference.
-
 This model was contributed by [gchhablani](https://huggingface.co/gchhablani). The original code can be found [here](https://github.com/google-research/google-research/tree/master/f_net).

-## Documentation resources
+## Usage tips
+
+The model was trained without an attention mask as it is based on Fourier Transform. The model was trained with 
+maximum sequence length 512 which includes pad tokens. Hence, it is highly recommended to use the same maximum 
+sequence length for fine-tuning and inference.
+
+## Resources

 - [Text classification task guide](../tasks/sequence_classification)
 - [Token classification task guide](../tasks/token_classification)
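
Reviewer note on the relocated usage tip: the hunk above says FNet was trained without an attention mask at a maximum sequence length of 512, pad tokens included, and recommends the same length for fine-tuning and inference. The sketch below is one hedged illustration of why that matters, not FNet's actual implementation. `PAD_ID`, `pad_or_truncate`, and `fourier_mix` are hypothetical names introduced here, and the naive 1D DFT over raw ids is only a toy stand-in for Fourier token mixing. With no mask, pad positions take part in the mixing, so changing the padded length changes the output at every position.

```python
import cmath

MAX_LEN = 512  # maximum sequence length stated in the usage tip
PAD_ID = 0     # hypothetical pad token id, for illustration only


def pad_or_truncate(token_ids, max_len=MAX_LEN, pad_id=PAD_ID):
    """Return exactly max_len ids: truncate long inputs, right-pad short ones."""
    ids = list(token_ids[:max_len])
    return ids + [pad_id] * (max_len - len(ids))


def fourier_mix(values):
    """Real part of a naive 1D DFT along the sequence axis.

    Toy stand-in for Fourier token mixing (an assumption of this sketch):
    every output position depends on every input position, pads included,
    because no attention mask is applied.
    """
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * cmath.pi * k * t / n) for t, v in enumerate(values)).real
        for k in range(n)
    ]


if __name__ == "__main__":
    tokens = [1, 2, 3]
    mixed_4 = fourier_mix(pad_or_truncate(tokens, max_len=4))
    mixed_8 = fourier_mix(pad_or_truncate(tokens, max_len=8))
    # Same real tokens, different padded lengths: the mixed values differ.
    print(mixed_4[1], mixed_8[1])
```

In practice this corresponds to padding and truncating every batch to the same fixed length used in pre-training, rather than padding dynamically to the longest example in each batch.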