From 2380136722a3ad045831c6f5eaf4636951e04417 Mon Sep 17 00:00:00 2001
From: AK391 <81195143+AK391@users.noreply.github.com>
Date: Tue, 4 Jan 2022 16:13:57 -0500
Subject: [PATCH 01/24] add spaces badges
---
docs/source/model_summary.mdx | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index 2966771cc0..8deb9b7b08 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -69,6 +69,9 @@ that at each position, the model can only look at the tokens before the attentio
+
+
+
[Improving Language Understanding by Generative Pre-Training](https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf), Alec Radford et al.
@@ -85,6 +88,9 @@ classification.
+
+
+
[Language Models are Unsupervised Multitask Learners](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf),
Alec Radford et al.
@@ -103,6 +109,9 @@ classification.
+
+
+
[CTRL: A Conditional Transformer Language Model for Controllable Generation](https://arxiv.org/abs/1909.05858),
Nitish Shirish Keskar et al.
From 59fb6369483c6b5e5c8bfe28c0c29412623e1668 Mon Sep 17 00:00:00 2001
From: AK391 <81195143+AK391@users.noreply.github.com>
Date: Thu, 6 Jan 2022 11:47:41 -0500
Subject: [PATCH 02/24] Transformer-XL badge
---
docs/source/model_summary.mdx | 3 +++
1 file changed, 3 insertions(+)
diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index 8deb9b7b08..70ef67866b 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -130,6 +130,9 @@ The library provides a version of the model for language modeling only.
+
+
+
[Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context](https://arxiv.org/abs/1901.02860), Zihang
Dai et al.
From 8d187e7feb7843b76cc0c113f3b9895268a88529 Mon Sep 17 00:00:00 2001
From: AK391 <81195143+AK391@users.noreply.github.com>
Date: Thu, 6 Jan 2022 11:59:21 -0500
Subject: [PATCH 03/24] Reformer Spaces badge
---
docs/source/model_summary.mdx | 3 +++
1 file changed, 3 insertions(+)
diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index 70ef67866b..2b57187b33 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -161,6 +161,9 @@ The library provides a version of the model for language modeling only.
+
+
+
[Reformer: The Efficient Transformer](https://arxiv.org/abs/2001.04451), Nikita Kitaev et al.
From f872f18dca81ffbb7d45e202ba91549145c0d4e3 Mon Sep 17 00:00:00 2001
From: AK391 <81195143+AK391@users.noreply.github.com>
Date: Thu, 6 Jan 2022 12:09:50 -0500
Subject: [PATCH 04/24] XLNet spaces badge
---
docs/source/model_summary.mdx | 3 +++
1 file changed, 3 insertions(+)
diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index 2b57187b33..64b2b95c26 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -199,6 +199,9 @@ The library provides a version of the model for language modeling only.
+
+
+
[XLNet: Generalized Autoregressive Pretraining for Language Understanding](https://arxiv.org/abs/1906.08237), Zhilin
Yang et al.
From 794441c379eeaa68bdc8e1b65772829aeedea98c Mon Sep 17 00:00:00 2001
From: AK391 <81195143+AK391@users.noreply.github.com>
Date: Thu, 6 Jan 2022 12:22:09 -0500
Subject: [PATCH 05/24] BERT spaces badge
---
docs/source/model_summary.mdx | 3 +++
1 file changed, 3 insertions(+)
diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index 64b2b95c26..ee7a844c67 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -234,6 +234,9 @@ corrupted versions.
+
+
+
[BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805),
Jacob Devlin et al.
From cac877425c142f4ae7b99f1601ab49f7c29ed56f Mon Sep 17 00:00:00 2001
From: AK391 <81195143+AK391@users.noreply.github.com>
Date: Thu, 6 Jan 2022 13:01:23 -0500
Subject: [PATCH 06/24] ALBERT spaces badge
---
docs/source/model_summary.mdx | 3 +++
1 file changed, 3 insertions(+)
diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index ee7a844c67..90a52c88ca 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -263,6 +263,9 @@ token classification, sentence classification, multiple choice classification an
+
+
+
[ALBERT: A Lite BERT for Self-supervised Learning of Language Representations](https://arxiv.org/abs/1909.11942),
Zhenzhong Lan et al.
From 1d7122729543ce0e3926ed5e8182cdf88ab25380 Mon Sep 17 00:00:00 2001
From: AK391 <81195143+AK391@users.noreply.github.com>
Date: Thu, 6 Jan 2022 18:50:19 -0500
Subject: [PATCH 07/24] Roberta spaces badge
---
docs/source/model_summary.mdx | 3 +++
1 file changed, 3 insertions(+)
diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index 90a52c88ca..f56c063313 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -292,6 +292,9 @@ classification, multiple choice classification and question answering.
+
+
+
[RoBERTa: A Robustly Optimized BERT Pretraining Approach](https://arxiv.org/abs/1907.11692), Yinhan Liu et al.
From 484e7a441f48c5b57aeca9daad0585d87d4f8331 Mon Sep 17 00:00:00 2001
From: AK391 <81195143+AK391@users.noreply.github.com>
Date: Fri, 7 Jan 2022 11:47:56 -0500
Subject: [PATCH 08/24] Distilbert spaces badge
---
docs/source/model_summary.mdx | 3 +++
1 file changed, 3 insertions(+)
diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index f56c063313..370f2346b5 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -317,6 +317,9 @@ classification, multiple choice classification and question answering.
+
+
+
[DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter](https://arxiv.org/abs/1910.01108),
Victor Sanh et al.
From 16b6df6fca891e353dd0feeb9fb0020bd7cd5d75 Mon Sep 17 00:00:00 2001
From: AK391 <81195143+AK391@users.noreply.github.com>
Date: Mon, 10 Jan 2022 10:33:03 -0500
Subject: [PATCH 09/24] ConvBERT spaces badge
---
docs/source/model_summary.mdx | 3 +++
1 file changed, 3 insertions(+)
diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index 370f2346b5..b680494525 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -342,6 +342,9 @@ and question answering.
+
+
+
[ConvBERT: Improving BERT with Span-based Dynamic Convolution](https://arxiv.org/abs/2008.02496), Zihang Jiang,
Weihao Yu, Daquan Zhou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan.
From 20fa9eb0353cb20f5e72dc9080bf98a9d7a9eacf Mon Sep 17 00:00:00 2001
From: AK391 <81195143+AK391@users.noreply.github.com>
Date: Mon, 10 Jan 2022 10:48:06 -0500
Subject: [PATCH 10/24] XLM Spaces badge
---
docs/source/model_summary.mdx | 3 +++
1 file changed, 3 insertions(+)
diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index b680494525..ede61e8d15 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -372,6 +372,9 @@ and question answering.
+
+
+
[Cross-lingual Language Model Pretraining](https://arxiv.org/abs/1901.07291), Guillaume Lample and Alexis Conneau
From 9f331168988d63d1b88e205d26a8b01a0e072f25 Mon Sep 17 00:00:00 2001
From: AK391 <81195143+AK391@users.noreply.github.com>
Date: Mon, 10 Jan 2022 10:54:18 -0500
Subject: [PATCH 11/24] XLM-Roberta Spaces badge
---
docs/source/model_summary.mdx | 3 +++
1 file changed, 3 insertions(+)
diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index ede61e8d15..5879346b32 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -406,6 +406,9 @@ question answering.
+
+
+
[Unsupervised Cross-lingual Representation Learning at Scale](https://arxiv.org/abs/1911.02116), Alexis Conneau et
al.
From 84f360e86201db50e7e3cb4ed027a54e549a08bb Mon Sep 17 00:00:00 2001
From: AK391 <81195143+AK391@users.noreply.github.com>
Date: Mon, 10 Jan 2022 11:41:10 -0500
Subject: [PATCH 12/24] FlauBERT spaces badge
---
docs/source/model_summary.mdx | 3 +++
1 file changed, 3 insertions(+)
diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index 5879346b32..f97520f019 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -428,6 +428,9 @@ classification, multiple choice classification and question answering.
+
+
+
[FlauBERT: Unsupervised Language Model Pre-training for French](https://arxiv.org/abs/1912.05372), Hang Le et al.
From 222c09a635eb5985b1b7e418d4579094cede13db Mon Sep 17 00:00:00 2001
From: AK391 <81195143+AK391@users.noreply.github.com>
Date: Mon, 10 Jan 2022 11:53:23 -0500
Subject: [PATCH 13/24] ELECTRA Spaces badge
---
docs/source/model_summary.mdx | 3 +++
1 file changed, 3 insertions(+)
diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index f97520f019..4d9f23fea1 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -446,6 +446,9 @@ The library provides a version of the model for language modeling and sentence c
+
+
+
[ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators](https://arxiv.org/abs/2003.10555),
Kevin Clark et al.
From 4fbc924d0a3e8a92e33e5a76d7d507e87cd08f21 Mon Sep 17 00:00:00 2001
From: AK391 <81195143+AK391@users.noreply.github.com>
Date: Mon, 10 Jan 2022 12:06:05 -0500
Subject: [PATCH 14/24] Funnel Transformer spaces badge
---
docs/source/model_summary.mdx | 3 +++
1 file changed, 3 insertions(+)
diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index 4d9f23fea1..bcc942a096 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -470,6 +470,9 @@ classification.
+
+
+
[Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing](https://arxiv.org/abs/2006.03236), Zihang Dai et al.
From 20f169b523dec3779eb127aa33bed4843f57a8b1 Mon Sep 17 00:00:00 2001
From: AK391 <81195143+AK391@users.noreply.github.com>
Date: Mon, 10 Jan 2022 12:14:18 -0500
Subject: [PATCH 15/24] Longformer Spaces badge
---
docs/source/model_summary.mdx | 3 +++
1 file changed, 3 insertions(+)
diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index bcc942a096..e77550ab9a 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -503,6 +503,9 @@ classification, multiple choice classification and question answering.
+
+
+
[Longformer: The Long-Document Transformer](https://arxiv.org/abs/2004.05150), Iz Beltagy et al.
From 03f8b9c9e03b444d00ed0a7efd49fd37bc82dff1 Mon Sep 17 00:00:00 2001
From: AK391 <81195143+AK391@users.noreply.github.com>
Date: Mon, 10 Jan 2022 12:33:59 -0500
Subject: [PATCH 16/24] BART Spaces badge
---
docs/source/model_summary.mdx | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index e77550ab9a..6123bc2bca 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -542,6 +542,10 @@ As mentioned before, these models keep both the encoder and the decoder of the o
+
+
+
+
[BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension](https://arxiv.org/abs/1910.13461), Mike Lewis et al.
From 7ec6aad23d30c1672a6fc448e458279a8df32e30 Mon Sep 17 00:00:00 2001
From: AK391 <81195143+AK391@users.noreply.github.com>
Date: Mon, 10 Jan 2022 12:39:22 -0500
Subject: [PATCH 17/24] Pegasus Spaces badge
---
docs/source/model_summary.mdx | 3 +++
1 file changed, 3 insertions(+)
diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index 6123bc2bca..232bb93a38 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -569,6 +569,9 @@ The library provides a version of this model for conditional generation and sequ
+
+
+
[PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization](https://arxiv.org/pdf/1912.08777.pdf), Jingqing Zhang, Yao Zhao, Mohammad Saleh and Peter J. Liu on Dec 18, 2019.
From 0554e4d5c56c11f90312c83389b4791475caf1b5 Mon Sep 17 00:00:00 2001
From: AK391 <81195143+AK391@users.noreply.github.com>
Date: Mon, 10 Jan 2022 12:47:12 -0500
Subject: [PATCH 18/24] MarianMT Spaces badge
---
docs/source/model_summary.mdx | 3 +++
1 file changed, 3 insertions(+)
diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index 232bb93a38..4542f541e1 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -599,6 +599,9 @@ The library provides a version of this model for conditional generation, which s
+
+
+
[Marian: Fast Neural Machine Translation in C++](https://arxiv.org/abs/1804.00344), Marcin Junczys-Dowmunt et al.
From daec528ca9a6482bd35aa01e2be82a3e0a8882b7 Mon Sep 17 00:00:00 2001
From: AK391 <81195143+AK391@users.noreply.github.com>
Date: Mon, 10 Jan 2022 12:51:39 -0500
Subject: [PATCH 19/24] T5 Spaces badge
---
docs/source/model_summary.mdx | 3 +++
1 file changed, 3 insertions(+)
diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index 4542f541e1..215b572f60 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -618,6 +618,9 @@ The library provides a version of this model for conditional generation.
+
+
+
[Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/abs/1910.10683), Colin Raffel et al.
From c9504b2f50606df38a0573df2429a530cacaee30 Mon Sep 17 00:00:00 2001
From: AK391 <81195143+AK391@users.noreply.github.com>
Date: Mon, 10 Jan 2022 12:57:08 -0500
Subject: [PATCH 20/24] MT5 Spaces badge
---
docs/source/model_summary.mdx | 3 +++
1 file changed, 3 insertions(+)
diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index 215b572f60..380909453a 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -650,6 +650,9 @@ The library provides a version of this model for conditional generation.
+
+
+
[mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/abs/2010.11934), Linting Xue
et al.
From bf0201e184d2ba428db22d05a240e4be585602ef Mon Sep 17 00:00:00 2001
From: AK391 <81195143+AK391@users.noreply.github.com>
Date: Mon, 10 Jan 2022 13:37:17 -0500
Subject: [PATCH 21/24] MBART spaces badge
---
docs/source/model_summary.mdx | 3 +++
1 file changed, 3 insertions(+)
diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index 380909453a..39fc87d262 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -671,6 +671,9 @@ The library provides a version of this model for conditional generation.
+
+
+
[Multilingual Denoising Pre-training for Neural Machine Translation](https://arxiv.org/abs/2001.08210) by Yinhan Liu,
Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer.
From ac2c06d492639894c126cb31424b33ea043fabae Mon Sep 17 00:00:00 2001
From: AK391 <81195143+AK391@users.noreply.github.com>
Date: Mon, 10 Jan 2022 13:43:34 -0500
Subject: [PATCH 22/24] ProphetNet spaces badge
---
docs/source/model_summary.mdx | 3 +++
1 file changed, 3 insertions(+)
diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index 39fc87d262..dbfa87aad7 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -700,6 +700,9 @@ finetuning.
+
+
+
[ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training](https://arxiv.org/abs/2001.04063) by
Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang, Ming Zhou.
From 4e3208662ec8fe96e5a16145cc95255cc2eab327 Mon Sep 17 00:00:00 2001
From: AK391 <81195143+AK391@users.noreply.github.com>
Date: Mon, 10 Jan 2022 13:50:40 -0500
Subject: [PATCH 23/24] DPR Spaces badge
---
docs/source/model_summary.mdx | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index dbfa87aad7..84e17b103b 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -775,6 +775,10 @@ Some models use documents retrieval during (pre)training and inference for open-
+
+
+
+
[Dense Passage Retrieval for Open-Domain Question Answering](https://arxiv.org/abs/2004.04906), Vladimir Karpukhin et
al.
From 5cd7086fdb702f7120c938d6f34ac34b148bf7e1 Mon Sep 17 00:00:00 2001
From: AK391 <81195143+AK391@users.noreply.github.com>
Date: Tue, 11 Jan 2022 00:11:31 -0500
Subject: [PATCH 24/24] XLM-ProphetNet Spaces badge
---
docs/source/model_summary.mdx | 3 +++
1 file changed, 3 insertions(+)
diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index 84e17b103b..e75a41056b 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -725,6 +725,9 @@ summarization.
+
+
+
[ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training](https://arxiv.org/abs/2001.04063) by
Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang, Ming Zhou.