From dd7a958fd6963d09850ad4842307d1d1064d096d Mon Sep 17 00:00:00 2001
From: Stefan Schweter
Date: Wed, 18 Dec 2019 19:45:46 +0100
Subject: [PATCH] docs: add XLM-RoBERTa to pretrained model list (incl. all parameters)

---
 docs/source/pretrained_models.rst | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/docs/source/pretrained_models.rst b/docs/source/pretrained_models.rst
index 7d037da34f..a359990f5a 100644
--- a/docs/source/pretrained_models.rst
+++ b/docs/source/pretrained_models.rst
@@ -240,6 +240,12 @@ Here is the full list of the currently provided pretrained models together with
 |                   | ``t5-11B``                                                 | | ~11B parameters with 24-layers, 1024-hidden-state, 65536 feed-forward hidden-state, 128-heads,                                      |
 |                   |                                                            | | Trained on English text: the Colossal Clean Crawled Corpus (C4)                                                                     |
 +-------------------+------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
+| XLM-RoBERTa       | ``xlm-roberta-base``                                       | | ~270M parameters with 12-layers, 768-hidden-state, 3072 feed-forward hidden-state, 12-heads,                                        |
+|                   |                                                            | | Trained on 2.5 TB of newly created clean CommonCrawl data in 100 languages                                                          |
+|                   +------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
+|                   | ``xlm-roberta-large``                                      | | ~550M parameters with 24-layers, 1024-hidden-state, 4096 feed-forward hidden-state, 16-heads,                                       |
+|                   |                                                            | | Trained on 2.5 TB of newly created clean CommonCrawl data in 100 languages                                                          |
++-------------------+------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
 
 
 .. `__