Enable doc in Spanish (#16518)
* Reorganize doc for multilingual support
* Fix style
* Style
* Toc trees
* Adapt templates
155
docs/source/en/model_doc/roformer.mdx
Normal file
@@ -0,0 +1,155 @@
<!--Copyright 2021 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# RoFormer

## Overview

The RoFormer model was proposed in [RoFormer: Enhanced Transformer with Rotary Position Embedding](https://arxiv.org/pdf/2104.09864v1.pdf) by Jianlin Su, Yu Lu, Shengfeng Pan, Bo Wen, and Yunfeng Liu.

The abstract from the paper is the following:

*Position encoding in transformer architecture provides supervision for dependency modeling between elements at
different positions in the sequence. We investigate various methods to encode positional information in
transformer-based language models and propose a novel implementation named Rotary Position Embedding(RoPE). The
proposed RoPE encodes absolute positional information with rotation matrix and naturally incorporates explicit relative
position dependency in self-attention formulation. Notably, RoPE comes with valuable properties such as flexibility of
being expand to any sequence lengths, decaying inter-token dependency with increasing relative distances, and
capability of equipping the linear self-attention with relative position encoding. As a result, the enhanced
transformer with rotary position embedding, or RoFormer, achieves superior performance in tasks with long texts. We
release the theoretical analysis along with some preliminary experiment results on Chinese data. The undergoing
experiment for English benchmark will soon be updated.*

Tips:

- RoFormer is a BERT-like autoencoding model with rotary position embeddings. Rotary position embeddings have shown
  improved performance on classification tasks with long texts.


This model was contributed by [junnyu](https://huggingface.co/junnyu). The original code can be found [here](https://github.com/ZhuiyiTechnology/roformer).
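
The abstract's claim that a rotation encodes absolute position while the attention score depends only on relative position can be checked with a small sketch. This is an illustration only, not the code used by the library or by the original repository: the function names `rotary_embed` and `dot`, the example vectors, and the pairing of adjacent dimensions are assumptions made here, and the frequency base of 10000 follows the sinusoidal convention.

```python
import math


def rotary_embed(vec, position, base=10000.0):
    """Rotate each consecutive pair of `vec` by an angle proportional to `position`.

    Hypothetical helper for illustration; pair i uses the frequency
    base ** (-2 * i / dim), so later pairs rotate more slowly.
    """
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("vector dimension must be even")
    out = []
    for i in range(dim // 2):
        angle = position * base ** (-2.0 * i / dim)
        cos_a, sin_a = math.cos(angle), math.sin(angle)
        x, y = vec[2 * i], vec[2 * i + 1]
        # Standard 2D rotation applied to the pair (x, y).
        out.extend([x * cos_a - y * sin_a, x * sin_a + y * cos_a])
    return out


def dot(a, b):
    """Plain dot product, standing in for an unnormalized attention score."""
    return sum(x * y for x, y in zip(a, b))


# Arbitrary example vectors (assumed values, not taken from any checkpoint).
query = [0.3, -1.2, 0.7, 0.5]
key = [1.1, 0.4, -0.6, 0.9]

# Both position pairs have the same offset (3), so the scores should agree
# up to floating-point error even though the absolute positions differ.
score_a = dot(rotary_embed(query, 5), rotary_embed(key, 2))
score_b = dot(rotary_embed(query, 13), rotary_embed(key, 10))
print(math.isclose(score_a, score_b, abs_tol=1e-9))
```

Because each pair is only rotated, the norm of the vector is unchanged, and position 0 leaves the vector as it is.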

## RoFormerConfig

[[autodoc]] RoFormerConfig

## RoFormerTokenizer

[[autodoc]] RoFormerTokenizer
    - build_inputs_with_special_tokens
    - get_special_tokens_mask
    - create_token_type_ids_from_sequences
    - save_vocabulary

## RoFormerTokenizerFast

[[autodoc]] RoFormerTokenizerFast
    - build_inputs_with_special_tokens

## RoFormerModel

[[autodoc]] RoFormerModel
    - forward

## RoFormerForCausalLM

[[autodoc]] RoFormerForCausalLM
    - forward

## RoFormerForMaskedLM

[[autodoc]] RoFormerForMaskedLM
    - forward

## RoFormerForSequenceClassification

[[autodoc]] RoFormerForSequenceClassification
    - forward

## RoFormerForMultipleChoice

[[autodoc]] RoFormerForMultipleChoice
    - forward

## RoFormerForTokenClassification

[[autodoc]] RoFormerForTokenClassification
    - forward

## RoFormerForQuestionAnswering

[[autodoc]] RoFormerForQuestionAnswering
    - forward

## TFRoFormerModel

[[autodoc]] TFRoFormerModel
    - call

## TFRoFormerForMaskedLM

[[autodoc]] TFRoFormerForMaskedLM
    - call

## TFRoFormerForCausalLM

[[autodoc]] TFRoFormerForCausalLM
    - call

## TFRoFormerForSequenceClassification

[[autodoc]] TFRoFormerForSequenceClassification
    - call

## TFRoFormerForMultipleChoice

[[autodoc]] TFRoFormerForMultipleChoice
    - call

## TFRoFormerForTokenClassification

[[autodoc]] TFRoFormerForTokenClassification
    - call

## TFRoFormerForQuestionAnswering

[[autodoc]] TFRoFormerForQuestionAnswering
    - call

## FlaxRoFormerModel

[[autodoc]] FlaxRoFormerModel
    - __call__

## FlaxRoFormerForMaskedLM

[[autodoc]] FlaxRoFormerForMaskedLM
    - __call__

## FlaxRoFormerForSequenceClassification

[[autodoc]] FlaxRoFormerForSequenceClassification
    - __call__

## FlaxRoFormerForMultipleChoice

[[autodoc]] FlaxRoFormerForMultipleChoice
    - __call__

## FlaxRoFormerForTokenClassification

[[autodoc]] FlaxRoFormerForTokenClassification
    - __call__

## FlaxRoFormerForQuestionAnswering

[[autodoc]] FlaxRoFormerForQuestionAnswering
    - __call__