[Docs] Model_doc structure/clarity improvements (#26876)
* first batch of structure improvements for model_docs * second batch of structure improvements for model_docs * more structure improvements for model_docs * more structure improvements for model_docs * structure improvements for cv model_docs * more structural refactoring * addressed feedback about image processors
This commit is contained in:
@@ -37,15 +37,15 @@ sequence lengths on GPUs (and across relatively shorter lengths on TPUs). Finall
|
||||
and is particularly efficient at smaller model sizes; for a fixed speed and accuracy budget, small FNet models
|
||||
outperform Transformer counterparts.*
|
||||
|
||||
Tips on usage:
|
||||
|
||||
- The model was trained without an attention mask as it is based on Fourier Transform. The model was trained with
|
||||
maximum sequence length 512 which includes pad tokens. Hence, it is highly recommended to use the same maximum
|
||||
sequence length for fine-tuning and inference.
|
||||
|
||||
This model was contributed by [gchhablani](https://huggingface.co/gchhablani). The original code can be found [here](https://github.com/google-research/google-research/tree/master/f_net).
|
||||
|
||||
## Documentation resources
|
||||
## Usage tips
|
||||
|
||||
The model was trained without an attention mask as it is based on Fourier Transform. The model was trained with
|
||||
maximum sequence length 512 which includes pad tokens. Hence, it is highly recommended to use the same maximum
|
||||
sequence length for fine-tuning and inference.
|
||||
|
||||
## Resources
|
||||
|
||||
- [Text classification task guide](../tasks/sequence_classification)
|
||||
- [Token classification task guide](../tasks/token_classification)
|
||||
|
||||
Reference in New Issue
Block a user