Fix many typos (#8708)
@@ -4,7 +4,7 @@ language: sv
 
 # Swedish BERT Models
 
-The National Library of Sweden / KBLab releases three pretrained language models based on BERT and ALBERT. The models are trained on aproximately 15-20GB of text (200M sentences, 3000M tokens) from various sources (books, news, government publications, swedish wikipedia and internet forums) aiming to provide a representative BERT model for Swedish text. A more complete description will be published later on.
+The National Library of Sweden / KBLab releases three pretrained language models based on BERT and ALBERT. The models are trained on approximately 15-20GB of text (200M sentences, 3000M tokens) from various sources (books, news, government publications, swedish wikipedia and internet forums) aiming to provide a representative BERT model for Swedish text. A more complete description will be published later on.
 
 The following three models are currently available:
 
@@ -86,7 +86,7 @@ for token in nlp(text):
 print(l)
 ```
 
-Which should result in the following (though less cleanly formated):
+Which should result in the following (though less cleanly formatted):
 
 ```python
 [ { 'word': 'Engelbert', 'score': 0.99..., 'entity': 'PRS'},
@@ -104,7 +104,7 @@ Which should result in the following (though less cleanly formated):
 
 ### ALBERT base
 
-The easisest way to do this is, again, using Huggingface Transformers:
+The easiest way to do this is, again, using Huggingface Transformers:
 
 ```python
 from transformers import AutoModel,AutoTokenizer
@@ -4,7 +4,7 @@ language: sv
 
 # Swedish BERT Models
 
-The National Library of Sweden / KBLab releases three pretrained language models based on BERT and ALBERT. The models are trained on aproximately 15-20GB of text (200M sentences, 3000M tokens) from various sources (books, news, government publications, swedish wikipedia and internet forums) aiming to provide a representative BERT model for Swedish text. A more complete description will be published later on.
+The National Library of Sweden / KBLab releases three pretrained language models based on BERT and ALBERT. The models are trained on approximately 15-20GB of text (200M sentences, 3000M tokens) from various sources (books, news, government publications, swedish wikipedia and internet forums) aiming to provide a representative BERT model for Swedish text. A more complete description will be published later on.
 
 The following three models are currently available:
 
@@ -86,7 +86,7 @@ for token in nlp(text):
 print(l)
 ```
 
-Which should result in the following (though less cleanly formated):
+Which should result in the following (though less cleanly formatted):
 
 ```python
 [ { 'word': 'Engelbert', 'score': 0.99..., 'entity': 'PRS'},
@@ -104,7 +104,7 @@ Which should result in the following (though less cleanly formated):
 
 ### ALBERT base
 
-The easisest way to do this is, again, using Huggingface Transformers:
+The easiest way to do this is, again, using Huggingface Transformers:
 
 ```python
 from transformers import AutoModel,AutoTokenizer
@@ -4,7 +4,7 @@ tags:
 ---
 
 ## CS224n SQuAD2.0 Project Dataset
-The goal of this model is to save CS224n students GPU time when establising
+The goal of this model is to save CS224n students GPU time when establishing
 baselines to beat for the [Default Final Project](http://web.stanford.edu/class/cs224n/project/default-final-project-handout.pdf).
 The training set used to fine-tune this model is the same as
 the [official one](https://rajpurkar.github.io/SQuAD-explorer/); however,
@@ -4,7 +4,7 @@ tags:
 ---
 
 ## CS224n SQuAD2.0 Project Dataset
-The goal of this model is to save CS224n students GPU time when establising
+The goal of this model is to save CS224n students GPU time when establishing
 baselines to beat for the [Default Final Project](http://web.stanford.edu/class/cs224n/project/default-final-project-handout.pdf).
 The training set used to fine-tune this model is the same as
 the [official one](https://rajpurkar.github.io/SQuAD-explorer/); however,
@@ -1,5 +1,5 @@
 ## CS224n SQuAD2.0 Project Dataset
-The goal of this model is to save CS224n students GPU time when establising
+The goal of this model is to save CS224n students GPU time when establishing
 baselines to beat for the [Default Final Project](http://web.stanford.edu/class/cs224n/project/default-final-project-handout.pdf).
 The training set used to fine-tune this model is the same as
 the [official one](https://rajpurkar.github.io/SQuAD-explorer/); however,
@@ -1,5 +1,5 @@
 ## CS224n SQuAD2.0 Project Dataset
-The goal of this model is to save CS224n students GPU time when establising
+The goal of this model is to save CS224n students GPU time when establishing
 baselines to beat for the [Default Final Project](http://web.stanford.edu/class/cs224n/project/default-final-project-handout.pdf).
 The training set used to fine-tune this model is the same as
 the [official one](https://rajpurkar.github.io/SQuAD-explorer/); however,
@@ -19,7 +19,7 @@ Transfer learning, where a model is first pre-trained on a data-rich task before
 
 ## Details of the downstream task (Intent Prediction) - Dataset 📚
 
-Dataset ID: ```event2Mind``` from [HugginFace/NLP](https://github.com/huggingface/nlp)
+Dataset ID: ```event2Mind``` from [Huggingface/NLP](https://github.com/huggingface/nlp)
 
 | Dataset | Split | # samples |
 | -------- | ----- | --------- |
@@ -19,7 +19,7 @@ Transfer learning, where a model is first pre-trained on a data-rich task before
 
 ## Details of the downstream task (Q&A) - Dataset 📚 🧐 ❓
 
-Dataset ID: ```squad``` from [HugginFace/NLP](https://github.com/huggingface/nlp)
+Dataset ID: ```squad``` from [Huggingface/NLP](https://github.com/huggingface/nlp)
 
 | Dataset | Split | # samples |
 | -------- | ----- | --------- |
@@ -19,7 +19,7 @@ Transfer learning, where a model is first pre-trained on a data-rich task before
 
 ## Details of the downstream task (Q&A) - Dataset 📚 🧐 ❓
 
-Dataset ID: ```squad_v2``` from [HugginFace/NLP](https://github.com/huggingface/nlp)
+Dataset ID: ```squad_v2``` from [Huggingface/NLP](https://github.com/huggingface/nlp)
 
 | Dataset | Split | # samples |
 | -------- | ----- | --------- |
@@ -19,7 +19,7 @@ Transfer learning, where a model is first pre-trained on a data-rich task before
 
 ## Details of the Dataset 📚
 
-Dataset ID: ```wikisql``` from [HugginFace/NLP](https://huggingface.co/nlp/viewer/?dataset=wikisql)
+Dataset ID: ```wikisql``` from [Huggingface/NLP](https://huggingface.co/nlp/viewer/?dataset=wikisql)
 
 | Dataset | Split | # samples |
 | -------- | ----- | --------- |
@@ -19,7 +19,7 @@ Transfer learning, where a model is first pre-trained on a data-rich task before
 
 ## Details of the Dataset 📚
 
-Dataset ID: ```wikisql``` from [HugginFace/NLP](https://huggingface.co/nlp/viewer/?dataset=wikisql)
+Dataset ID: ```wikisql``` from [Huggingface/NLP](https://huggingface.co/nlp/viewer/?dataset=wikisql)
 
 | Dataset | Split | # samples |
 | -------- | ----- | --------- |
@@ -19,7 +19,7 @@ Transfer learning, where a model is first pre-trained on a data-rich task before
 
 ## Details of the downstream task (Question Paraphrasing) - Dataset 📚❓↔️❓
 
-Dataset ID: ```quora``` from [HugginFace/NLP](https://github.com/huggingface/nlp)
+Dataset ID: ```quora``` from [Huggingface/NLP](https://github.com/huggingface/nlp)
 
 | Dataset | Split | # samples |
 | -------- | ----- | --------- |
@@ -19,7 +19,7 @@ Transfer learning, where a model is first pre-trained on a data-rich task before
 
 ## Details of the downstream task (Q&A) - Dataset 📚 🧐 ❓
 
-Dataset ID: ```squad``` from [HugginFace/NLP](https://github.com/huggingface/nlp)
+Dataset ID: ```squad``` from [Huggingface/NLP](https://github.com/huggingface/nlp)
 
 | Dataset | Split | # samples |
 | -------- | ----- | --------- |
@@ -19,7 +19,7 @@ Transfer learning, where a model is first pre-trained on a data-rich task before
 
 ## Details of the downstream task (Q&A) - Dataset 📚 🧐 ❓
 
-Dataset ID: ```squad_v2``` from [HugginFace/NLP](https://github.com/huggingface/nlp)
+Dataset ID: ```squad_v2``` from [Huggingface/NLP](https://github.com/huggingface/nlp)
 
 | Dataset | Split | # samples |
 | -------- | ----- | --------- |
@@ -19,7 +19,7 @@ Transfer learning, where a model is first pre-trained on a data-rich task before
 
 ## Details of the Dataset 📚
 
-Dataset ID: ```wikisql``` from [HugginFace/NLP](https://huggingface.co/nlp/viewer/?dataset=wikisql)
+Dataset ID: ```wikisql``` from [Huggingface/NLP](https://huggingface.co/nlp/viewer/?dataset=wikisql)
 
 | Dataset | Split | # samples |
 | -------- | ----- | --------- |