From ebb32261b19eaa258f998d2725116fe7a08224a6 Mon Sep 17 00:00:00 2001
From: LysandreJik
Date: Wed, 2 Oct 2019 17:52:56 -0400
Subject: [PATCH] fix #1401

---
 docs/source/pretrained_models.rst | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/docs/source/pretrained_models.rst b/docs/source/pretrained_models.rst
index 4c17b35c84..c12a9bc52f 100644
--- a/docs/source/pretrained_models.rst
+++ b/docs/source/pretrained_models.rst
@@ -98,6 +98,12 @@ Here is the full list of the currently provided pretrained models together with
 |                   +------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
 |                   | ``xlm-clm-ende-1024``                                      | | 6-layer, 1024-hidden, 8-heads                                                                                                       |
 |                   |                                                            | | XLM English-German model trained with CLM (Causal Language Modeling) on the concatenation of English and German wikipedia           |
+|                   +------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
+|                   | ``xlm-mlm-17-1280``                                        | | 16-layer, 1280-hidden, 16-heads                                                                                                     |
+|                   |                                                            | | XLM model trained with MLM (Masked Language Modeling) on 17 languages.                                                              |
+|                   +------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
+|                   | ``xlm-mlm-100-1280``                                       | | 16-layer, 1280-hidden, 16-heads                                                                                                     |
+|                   |                                                            | | XLM model trained with MLM (Masked Language Modeling) on 100 languages.                                                             |
 +-------------------+------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
 | RoBERTa           | ``roberta-base``                                           | | 12-layer, 768-hidden, 12-heads, 125M parameters                                                                                     |
 |                   |                                                            | | RoBERTa using the BERT-base architecture                                                                                            |