[model_cards] Add model cards for Urduhack model (roberta-urdu-small) (#6536)
* [model_cards] roberta-urdu-small added.
* [model_cards] typo fixed.
* Tweak license format (yaml expects a simple string)

Co-authored-by: Ikram Ali <mrikram1989>
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
model_cards/urduhack/roberta-urdu-small/README.md
@@ -0,0 +1,30 @@
---
language: ur
thumbnail: https://raw.githubusercontent.com/urduhack/urduhack/master/docs/_static/urduhack.png
tags:
- roberta-urdu-small
- urdu
- transformers
license: mit
---
## roberta-urdu-small

[MIT License](https://github.com/urduhack/urduhack/blob/master/LICENSE)
### Overview
**Language model:** roberta-urdu-small
**Model size:** 125M
**Language:** Urdu
**Training data:** News data from Urdu news sources in Pakistan
### About roberta-urdu-small
roberta-urdu-small is a language model for the Urdu language.
```
from transformers import pipeline

# Build a fill-mask pipeline that loads the model together with its matching tokenizer.
fill_mask = pipeline("fill-mask", model="urduhack/roberta-urdu-small", tokenizer="urduhack/roberta-urdu-small")
```
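The pipeline above returns candidate tokens for a masked position. As a minimal sketch of how those results might be consumed, the hypothetical helper `top_predictions` below (not part of transformers or urduhack) ranks the candidates by score. It assumes the input contains exactly one mask token and that the tokenizer uses RoBERTa's usual `<mask>` token; `fill_mask.tokenizer.mask_token` gives the actual value.

```python
def top_predictions(fill_mask, text, k=3):
    """Return the k highest-scoring (token, score) pairs for a single masked slot.

    `fill_mask` is any callable that behaves like a transformers fill-mask
    pipeline: for one mask token it returns a list of dicts that carry
    "token_str" and "score" keys.
    """
    results = fill_mask(text)
    # Sort explicitly by score so the ranking does not rely on the input order.
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    return [(r["token_str"].strip(), r["score"]) for r in ranked[:k]]


# Hypothetical usage with the pipeline from the model card (downloads the model):
#
#   from transformers import pipeline
#   fill_mask = pipeline("fill-mask", model="urduhack/roberta-urdu-small",
#                        tokenizer="urduhack/roberta-urdu-small")
#   mask = fill_mask.tokenizer.mask_token
#   print(top_predictions(fill_mask, f"پاکستان کا دارالحکومت {mask} ہے۔"))
```

Passing the pipeline in as an argument keeps the helper independent of how the model is loaded, so it can be exercised with a stub callable as well.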
## Training procedure
roberta-urdu-small was trained on an Urdu news corpus. The training data was normalized with the normalization module from
urduhack to eliminate characters from other languages, such as Arabic.

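To illustrate what this kind of normalization involves, here is a minimal standalone sketch. It does not call urduhack and is not its implementation: the function name `normalize_to_urdu` and the three-entry mapping are assumptions chosen for illustration, replacing a few Arabic code points with the visually similar letters normally used in Urdu text.

```python
# Illustrative subset of Arabic-to-Urdu replacements (an assumption for this
# sketch, not urduhack's actual table).
ARABIC_TO_URDU = {
    "\u0643": "\u06a9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
    "\u064a": "\u06cc",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0647": "\u06c1",  # ARABIC LETTER HEH -> ARABIC LETTER HEH GOAL
}

# str.translate works on a table keyed by code point; maketrans builds it.
_TRANSLATION_TABLE = str.maketrans(ARABIC_TO_URDU)


def normalize_to_urdu(text: str) -> str:
    """Replace the mapped Arabic characters; leave everything else untouched."""
    return text.translate(_TRANSLATION_TABLE)
```

Applying the same normalization to text at inference time is likely to help, since the model only saw normalized characters during training.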
### About Urduhack
Urduhack is a Natural Language Processing (NLP) library for the Urdu language.
GitHub: https://github.com/urduhack/urduhack