[model_cards] Add language metadata to existing model cards

This will enable filtering on language (amongst other tags) on the website

cc @loretoparisi, @stefan-it, @HenrykBorzymowski, @marma
Julien Chaumond
2020-02-10 17:42:42 -05:00
parent ba498eac38
commit 95bac8dabb
13 changed files with 49 additions and 1 deletions

@@ -1,3 +1,7 @@
---
language: swedish
---
# Swedish BERT Models
The National Library of Sweden / KBLab releases three pretrained language models based on BERT and ALBERT. The models are trained on approximately 15-20GB of text (200M sentences, 3000M tokens) from various sources (books, news, government publications, Swedish Wikipedia and internet forums) aiming to provide a representative BERT model for Swedish text. A more complete description will be published later on.

@@ -1,3 +1,7 @@
---
language: swedish
---
# Swedish BERT Models
The National Library of Sweden / KBLab releases three pretrained language models based on BERT and ALBERT. The models are trained on approximately 15-20GB of text (200M sentences, 3000M tokens) from various sources (books, news, government publications, Swedish Wikipedia and internet forums) aiming to provide a representative BERT model for Swedish text. A more complete description will be published later on.

@@ -1,3 +1,7 @@
---
language: swedish
---
# Swedish BERT Models
The National Library of Sweden / KBLab releases three pretrained language models based on BERT and ALBERT. The models are trained on approximately 15-20GB of text (200M sentences, 3000M tokens) from various sources (books, news, government publications, Swedish Wikipedia and internet forums) aiming to provide a representative BERT model for Swedish text. A more complete description will be published later on.

@@ -1,3 +1,7 @@
---
language: italian
---
# UmBERTo Commoncrawl Cased
[UmBERTo](https://github.com/musixmatchresearch/umberto) is a Roberta-based Language Model trained on large Italian Corpora and uses two innovative approaches: SentencePiece and Whole Word Masking. Now available at [github.com/huggingface/transformers](https://huggingface.co/Musixmatch/umberto-commoncrawl-cased-v1)

@@ -1,3 +1,7 @@
---
language: italian
---
# UmBERTo Wikipedia Uncased
[UmBERTo](https://github.com/musixmatchresearch/umberto) is a Roberta-based Language Model trained on large Italian Corpora and uses two innovative approaches: SentencePiece and Whole Word Masking. Now available at [github.com/huggingface/transformers](https://huggingface.co/Musixmatch/umberto-commoncrawl-cased-v1)

@@ -1,5 +1,5 @@
---
-thumbnail: https://github.com/JetRunner/BERT-of-Theseus/blob/master/bert-of-theseus.png?raw=true
+thumbnail: https://raw.githubusercontent.com/JetRunner/BERT-of-Theseus/master/bert-of-theseus.png
---
# BERT-of-Theseus

@@ -1,3 +1,7 @@
---
language: german
---
# 🤗 + 📚 dbmdz German BERT models
In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State

@@ -1,3 +1,7 @@
---
language: german
---
# 🤗 + 📚 dbmdz German BERT models
In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State

@@ -1,3 +1,7 @@
---
language: italian
---
# 🤗 + 📚 dbmdz BERT models
In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State

@@ -1,3 +1,7 @@
---
language: italian
---
# 🤗 + 📚 dbmdz BERT models
In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State

@@ -1,3 +1,7 @@
---
language: italian
---
# 🤗 + 📚 dbmdz BERT models
In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State

@@ -1,3 +1,7 @@
---
language: italian
---
# 🤗 + 📚 dbmdz BERT models
In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State

@@ -1,3 +1,7 @@
---
language: dutch
---
# Multilingual + Dutch SQuAD2.0
This model is the multilingual model provided by the Google research team with a fine-tuned Dutch Q&A downstream task.
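
For readers who want to see how the `language` front matter added in this commit could drive filtering, here is a minimal sketch. It is not the website's actual implementation, which this commit does not show. The function names `split_front_matter` and `filter_by_language` and the card names in the example are hypothetical, and the parser only handles the flat `key: value` pairs that appear in these model cards; a real consumer would more likely use a full YAML parser.

```python
from typing import Dict, List, Tuple


def split_front_matter(card_text: str) -> Tuple[Dict[str, str], str]:
    """Split a model card into (metadata, body).

    Assumption: the front matter is a block of flat `key: value` lines
    enclosed by two `---` lines at the very top of the file.
    """
    lines = card_text.splitlines()
    if not lines or lines[0].strip() != "---":
        # No front matter at all: everything is body.
        return {}, card_text

    metadata: Dict[str, str] = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Closing delimiter found: the rest is the card body.
            return metadata, "\n".join(lines[index + 1:])
        # Split on the first colon only, so URL values stay intact.
        key, separator, value = line.partition(":")
        if separator:
            metadata[key.strip()] = value.strip()

    # Opening delimiter without a closing one: treat as plain body.
    return {}, card_text


def filter_by_language(cards: Dict[str, str], language: str) -> List[str]:
    """Return the sorted names of cards whose `language` tag matches."""
    return sorted(
        name
        for name, text in cards.items()
        if split_front_matter(text)[0].get("language") == language
    )


if __name__ == "__main__":
    # Hypothetical card names mapped to card contents.
    example_cards = {
        "example/swedish-bert": "---\nlanguage: swedish\n---\n# Swedish BERT Models\n",
        "example/umberto": "---\nlanguage: italian\n---\n# UmBERTo Commoncrawl Cased\n",
        "example/untagged": "# A card without front matter\n",
    }
    print(filter_by_language(example_cards, "italian"))
```

Cards without front matter, such as the untagged one in the example, simply do not match any language, which is why the commit has to touch each existing card.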