diff --git a/docs/source/model_summary.mdx b/docs/source/model_summary.mdx
index 64b2b95c26..ee7a844c67 100644
--- a/docs/source/model_summary.mdx
+++ b/docs/source/model_summary.mdx
@@ -234,6 +234,9 @@ corrupted versions.
+
+
+
[BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805),
Jacob Devlin et al.