Model summary horizontal banners (#15058)
@@ -63,12 +63,14 @@ that at each position, the model can only look at the tokens before the attentio
|
|||||||
|
|
||||||
### Original GPT
|
### Original GPT
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=openai-gpt">
|
<a href="https://huggingface.co/models?filter=openai-gpt">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-openai--gpt-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-openai--gpt-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/openai-gpt">
|
<a href="model_doc/openai-gpt">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-openai--gpt-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-openai--gpt-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[Improving Language Understanding by Generative Pre-Training](https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf), Alec Radford et al.
|
[Improving Language Understanding by Generative Pre-Training](https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf), Alec Radford et al.
|
||||||
|
|
||||||
@@ -79,12 +81,14 @@ classification.
|
|||||||
|
|
||||||
### GPT-2
|
### GPT-2
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=gpt2">
|
<a href="https://huggingface.co/models?filter=gpt2">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-gpt2-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-gpt2-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/gpt2">
|
<a href="model_doc/gpt2">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-gpt2-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-gpt2-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[Language Models are Unsupervised Multitask Learners](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf),
|
[Language Models are Unsupervised Multitask Learners](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf),
|
||||||
Alec Radford et al.
|
Alec Radford et al.
|
||||||
@@ -97,12 +101,14 @@ classification.
|
|||||||
|
|
||||||
### CTRL
|
### CTRL
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=ctrl">
|
<a href="https://huggingface.co/models?filter=ctrl">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-ctrl-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-ctrl-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/ctrl">
|
<a href="model_doc/ctrl">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-ctrl-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-ctrl-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[CTRL: A Conditional Transformer Language Model for Controllable Generation](https://arxiv.org/abs/1909.05858),
|
[CTRL: A Conditional Transformer Language Model for Controllable Generation](https://arxiv.org/abs/1909.05858),
|
||||||
Nitish Shirish Keskar et al.
|
Nitish Shirish Keskar et al.
|
||||||
@@ -115,12 +121,14 @@ The library provides a version of the model for language modeling only.
|
|||||||
|
|
||||||
### Transformer-XL
|
### Transformer-XL
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=transfo-xl">
|
<a href="https://huggingface.co/models?filter=transfo-xl">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-transfo--xl-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-transfo--xl-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/transfo-xl">
|
<a href="model_doc/transfo-xl">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-transfo--xl-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-transfo--xl-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context](https://arxiv.org/abs/1901.02860), Zihang
|
[Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context](https://arxiv.org/abs/1901.02860), Zihang
|
||||||
Dai et al.
|
Dai et al.
|
||||||
@@ -143,12 +151,14 @@ The library provides a version of the model for language modeling only.
|
|||||||
|
|
||||||
### Reformer
|
### Reformer
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=reformer">
|
<a href="https://huggingface.co/models?filter=reformer">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-reformer-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-reformer-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/reformer">
|
<a href="model_doc/reformer">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-reformer-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-reformer-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[Reformer: The Efficient Transformer](https://arxiv.org/abs/2001.04451), Nikita Kitaev et al.
|
[Reformer: The Efficient Transformer](https://arxiv.org/abs/2001.04451), Nikita Kitaev et al.
|
||||||
|
|
||||||
@@ -178,12 +188,14 @@ The library provides a version of the model for language modeling only.
|
|||||||
|
|
||||||
### XLNet
|
### XLNet
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=xlnet">
|
<a href="https://huggingface.co/models?filter=xlnet">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-xlnet-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-xlnet-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/xlnet">
|
<a href="model_doc/xlnet">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-xlnet-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-xlnet-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[XLNet: Generalized Autoregressive Pretraining for Language Understanding](https://arxiv.org/abs/1906.08237), Zhilin
|
[XLNet: Generalized Autoregressive Pretraining for Language Understanding](https://arxiv.org/abs/1906.08237), Zhilin
|
||||||
Yang et al.
|
Yang et al.
|
||||||
@@ -210,12 +222,14 @@ corrupted versions.
|
|||||||
|
|
||||||
### BERT
|
### BERT
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=bert">
|
<a href="https://huggingface.co/models?filter=bert">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-bert-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-bert-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/bert">
|
<a href="model_doc/bert">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-bert-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-bert-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805),
|
[BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805),
|
||||||
Jacob Devlin et al.
|
Jacob Devlin et al.
|
||||||
@@ -236,12 +250,14 @@ token classification, sentence classification, multiple choice classification an
|
|||||||
|
|
||||||
### ALBERT
|
### ALBERT
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=albert">
|
<a href="https://huggingface.co/models?filter=albert">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-albert-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-albert-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/albert">
|
<a href="model_doc/albert">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-albert-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-albert-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[ALBERT: A Lite BERT for Self-supervised Learning of Language Representations](https://arxiv.org/abs/1909.11942),
|
[ALBERT: A Lite BERT for Self-supervised Learning of Language Representations](https://arxiv.org/abs/1909.11942),
|
||||||
Zhenzhong Lan et al.
|
Zhenzhong Lan et al.
|
||||||
@@ -262,12 +278,14 @@ classification, multiple choice classification and question answering.
|
|||||||
|
|
||||||
### RoBERTa
|
### RoBERTa
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=roberta">
|
<a href="https://huggingface.co/models?filter=roberta">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-roberta-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-roberta-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/roberta">
|
<a href="model_doc/roberta">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-roberta-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-roberta-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[RoBERTa: A Robustly Optimized BERT Pretraining Approach](https://arxiv.org/abs/1907.11692), Yinhan Liu et al.
|
[RoBERTa: A Robustly Optimized BERT Pretraining Approach](https://arxiv.org/abs/1907.11692), Yinhan Liu et al.
|
||||||
|
|
||||||
@@ -284,12 +302,14 @@ classification, multiple choice classification and question answering.
|
|||||||
|
|
||||||
### DistilBERT
|
### DistilBERT
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=distilbert">
|
<a href="https://huggingface.co/models?filter=distilbert">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-distilbert-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-distilbert-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/distilbert">
|
<a href="model_doc/distilbert">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-distilbert-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-distilbert-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter](https://arxiv.org/abs/1910.01108),
|
[DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter](https://arxiv.org/abs/1910.01108),
|
||||||
Victor Sanh et al.
|
Victor Sanh et al.
|
||||||
@@ -306,12 +326,14 @@ and question answering.
|
|||||||
|
|
||||||
### ConvBERT
|
### ConvBERT
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=convbert">
|
<a href="https://huggingface.co/models?filter=convbert">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-convbert-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-convbert-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/convbert">
|
<a href="model_doc/convbert">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-convbert-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-convbert-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[ConvBERT: Improving BERT with Span-based Dynamic Convolution](https://arxiv.org/abs/2008.02496), Zihang Jiang,
|
[ConvBERT: Improving BERT with Span-based Dynamic Convolution](https://arxiv.org/abs/2008.02496), Zihang Jiang,
|
||||||
Weihao Yu, Daquan Zhou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan.
|
Weihao Yu, Daquan Zhou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan.
|
||||||
@@ -333,12 +355,14 @@ and question answering.
|
|||||||
|
|
||||||
### XLM
|
### XLM
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=xlm">
|
<a href="https://huggingface.co/models?filter=xlm">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-xlm-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-xlm-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/xlm">
|
<a href="model_doc/xlm">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-xlm-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-xlm-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[Cross-lingual Language Model Pretraining](https://arxiv.org/abs/1901.07291), Guillaume Lample and Alexis Conneau
|
[Cross-lingual Language Model Pretraining](https://arxiv.org/abs/1901.07291), Guillaume Lample and Alexis Conneau
|
||||||
|
|
||||||
@@ -364,12 +388,14 @@ question answering.
|
|||||||
|
|
||||||
### XLM-RoBERTa
|
### XLM-RoBERTa
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=xlm-roberta">
|
<a href="https://huggingface.co/models?filter=xlm-roberta">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-xlm--roberta-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-xlm--roberta-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/xlm-roberta">
|
<a href="model_doc/xlm-roberta">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-xlm--roberta-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-xlm--roberta-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[Unsupervised Cross-lingual Representation Learning at Scale](https://arxiv.org/abs/1911.02116), Alexis Conneau et
|
[Unsupervised Cross-lingual Representation Learning at Scale](https://arxiv.org/abs/1911.02116), Alexis Conneau et
|
||||||
al.
|
al.
|
||||||
@@ -383,12 +409,14 @@ classification, multiple choice classification and question answering.
|
|||||||
|
|
||||||
### FlauBERT
|
### FlauBERT
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=flaubert">
|
<a href="https://huggingface.co/models?filter=flaubert">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-flaubert-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-flaubert-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/flaubert">
|
<a href="model_doc/flaubert">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-flaubert-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-flaubert-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[FlauBERT: Unsupervised Language Model Pre-training for French](https://arxiv.org/abs/1912.05372), Hang Le et al.
|
[FlauBERT: Unsupervised Language Model Pre-training for French](https://arxiv.org/abs/1912.05372), Hang Le et al.
|
||||||
|
|
||||||
@@ -398,12 +426,14 @@ The library provides a version of the model for language modeling and sentence c
|
|||||||
|
|
||||||
### ELECTRA
|
### ELECTRA
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=electra">
|
<a href="https://huggingface.co/models?filter=electra">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-electra-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-electra-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/electra">
|
<a href="model_doc/electra">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-electra-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-electra-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators](https://arxiv.org/abs/2003.10555),
|
[ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators](https://arxiv.org/abs/2003.10555),
|
||||||
Kevin Clark et al.
|
Kevin Clark et al.
|
||||||
@@ -419,12 +449,14 @@ classification.
|
|||||||
|
|
||||||
### Funnel Transformer
|
### Funnel Transformer
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=funnel">
|
<a href="https://huggingface.co/models?filter=funnel">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-funnel-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-funnel-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/funnel">
|
<a href="model_doc/funnel">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-funnel-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-funnel-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing](https://arxiv.org/abs/2006.03236), Zihang Dai et al.
|
[Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing](https://arxiv.org/abs/2006.03236), Zihang Dai et al.
|
||||||
|
|
||||||
@@ -449,12 +481,14 @@ classification, multiple choice classification and question answering.
|
|||||||
|
|
||||||
### Longformer
|
### Longformer
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=longformer">
|
<a href="https://huggingface.co/models?filter=longformer">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-longformer-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-longformer-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/longformer">
|
<a href="model_doc/longformer">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-longformer-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-longformer-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[Longformer: The Long-Document Transformer](https://arxiv.org/abs/2004.05150), Iz Beltagy et al.
|
[Longformer: The Long-Document Transformer](https://arxiv.org/abs/2004.05150), Iz Beltagy et al.
|
||||||
|
|
||||||
@@ -485,12 +519,14 @@ As mentioned before, these models keep both the encoder and the decoder of the o
|
|||||||
|
|
||||||
### BART
|
### BART
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=bart">
|
<a href="https://huggingface.co/models?filter=bart">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-bart-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-bart-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/bart">
|
<a href="model_doc/bart">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-bart-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-bart-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension](https://arxiv.org/abs/1910.13461), Mike Lewis et al.
|
[BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension](https://arxiv.org/abs/1910.13461), Mike Lewis et al.
|
||||||
|
|
||||||
@@ -508,12 +544,14 @@ The library provides a version of this model for conditional generation and sequ
|
|||||||
|
|
||||||
### Pegasus
|
### Pegasus
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=pegasus">
|
<a href="https://huggingface.co/models?filter=pegasus">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-pegasus-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-pegasus-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/pegasus">
|
<a href="model_doc/pegasus">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-pegasus-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-pegasus-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization](https://arxiv.org/pdf/1912.08777.pdf), Jingqing Zhang, Yao Zhao, Mohammad Saleh and Peter J. Liu on Dec 18, 2019.
|
[PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization](https://arxiv.org/pdf/1912.08777.pdf), Jingqing Zhang, Yao Zhao, Mohammad Saleh and Peter J. Liu on Dec 18, 2019.
|
||||||
|
|
||||||
@@ -535,12 +573,14 @@ The library provides a version of this model for conditional generation, which s
|
|||||||
|
|
||||||
### MarianMT
|
### MarianMT
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=marian">
|
<a href="https://huggingface.co/models?filter=marian">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-marian-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-marian-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/marian">
|
<a href="model_doc/marian">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-marian-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-marian-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[Marian: Fast Neural Machine Translation in C++](https://arxiv.org/abs/1804.00344), Marcin Junczys-Dowmunt et al.
|
[Marian: Fast Neural Machine Translation in C++](https://arxiv.org/abs/1804.00344), Marcin Junczys-Dowmunt et al.
|
||||||
|
|
||||||
@@ -551,12 +591,14 @@ The library provides a version of this model for conditional generation.
|
|||||||
|
|
||||||
### T5
|
### T5
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=t5">
|
<a href="https://huggingface.co/models?filter=t5">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-t5-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-t5-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/t5">
|
<a href="model_doc/t5">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-t5-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-t5-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/abs/1910.10683), Colin Raffel et al.
|
[Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/abs/1910.10683), Colin Raffel et al.
|
||||||
|
|
||||||
@@ -580,12 +622,14 @@ The library provides a version of this model for conditional generation.
|
|||||||
|
|
||||||
### MT5
|
### MT5
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=mt5">
|
<a href="https://huggingface.co/models?filter=mt5">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-mt5-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-mt5-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/mt5">
|
<a href="model_doc/mt5">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-mt5-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-mt5-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/abs/2010.11934), Linting Xue
|
[mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/abs/2010.11934), Linting Xue
|
||||||
et al.
|
et al.
|
||||||
@@ -598,12 +642,14 @@ The library provides a version of this model for conditional generation.
|
|||||||
|
|
||||||
### MBart
|
### MBart
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=mbart">
|
<a href="https://huggingface.co/models?filter=mbart">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-mbart-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-mbart-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/mbart">
|
<a href="model_doc/mbart">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-mbart-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-mbart-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[Multilingual Denoising Pre-training for Neural Machine Translation](https://arxiv.org/abs/2001.08210) by Yinhan Liu,
|
[Multilingual Denoising Pre-training for Neural Machine Translation](https://arxiv.org/abs/2001.08210) by Yinhan Liu,
|
||||||
Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer.
|
Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer.
|
||||||
@@ -624,12 +670,14 @@ finetuning.
|
|||||||
|
|
||||||
### ProphetNet
|
### ProphetNet
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=prophetnet">
|
<a href="https://huggingface.co/models?filter=prophetnet">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-prophetnet-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-prophetnet-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/prophetnet">
|
<a href="model_doc/prophetnet">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-prophetnet-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-prophetnet-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training](https://arxiv.org/abs/2001.04063) by
|
[ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training](https://arxiv.org/abs/2001.04063) by
|
||||||
Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang, Ming Zhou.
|
Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang, Ming Zhou.
|
||||||
@@ -646,12 +694,14 @@ summarization.
|
|||||||
|
|
||||||
### XLM-ProphetNet
|
### XLM-ProphetNet
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=xprophetnet">
|
<a href="https://huggingface.co/models?filter=xprophetnet">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-xprophetnet-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-xprophetnet-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/xlm-prophetnet">
|
<a href="model_doc/xlm-prophetnet">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-xprophetnet-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-xprophetnet-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training](https://arxiv.org/abs/2001.04063) by
|
[ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training](https://arxiv.org/abs/2001.04063) by
|
||||||
Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang, Ming Zhou.
|
Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang, Ming Zhou.
|
||||||
@@ -696,12 +746,14 @@ Some models use documents retrieval during (pre)training and inference for open-
|
|||||||
|
|
||||||
### DPR
|
### DPR
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=dpr">
|
<a href="https://huggingface.co/models?filter=dpr">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-dpr-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-dpr-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/dpr">
|
<a href="model_doc/dpr">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-dpr-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-dpr-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[Dense Passage Retrieval for Open-Domain Question Answering](https://arxiv.org/abs/2004.04906), Vladimir Karpukhin et
|
[Dense Passage Retrieval for Open-Domain Question Answering](https://arxiv.org/abs/2004.04906), Vladimir Karpukhin et
|
||||||
al.
|
al.
|
||||||
@@ -722,12 +774,14 @@ then it calls the reader with the question and the retrieved documents to get th
|
|||||||
|
|
||||||
### RAG
|
### RAG
|
||||||
|
|
||||||
|
<div class="flex flex-wrap space-x-1">
|
||||||
<a href="https://huggingface.co/models?filter=rag">
|
<a href="https://huggingface.co/models?filter=rag">
|
||||||
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-rag-blueviolet">
|
<img alt="Models" src="https://img.shields.io/badge/All_model_pages-rag-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
<a href="model_doc/rag">
|
<a href="model_doc/rag">
|
||||||
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-rag-blueviolet">
|
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-rag-blueviolet">
|
||||||
</a>
|
</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
[Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks](https://arxiv.org/abs/2005.11401), Patrick Lewis,
|
[Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks](https://arxiv.org/abs/2005.11401), Patrick Lewis,
|
||||||
Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau
|
Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau
|
||||||
|
|||||||