From 6a13448ad2d935d38e883f5671a16ce1b70c4e34 Mon Sep 17 00:00:00 2001
From: Manuel Romero
Date: Tue, 10 Mar 2020 05:06:08 +0100
Subject: [PATCH] Update README.md

- Fix path of tokenizer
- Clarify that the model is not trained on the evaluation set
---
 .../mrm8488/bert-multi-uncased-finetuned-xquadv1/README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/model_cards/mrm8488/bert-multi-uncased-finetuned-xquadv1/README.md b/model_cards/mrm8488/bert-multi-uncased-finetuned-xquadv1/README.md
index fed8159163..79acde75b9 100644
--- a/model_cards/mrm8488/bert-multi-uncased-finetuned-xquadv1/README.md
+++ b/model_cards/mrm8488/bert-multi-uncased-finetuned-xquadv1/README.md
@@ -65,7 +65,7 @@ Citation:
-I used `Data augmentation techniques` to obtain more samples and splited the dataset in order to have a train and test set. The test set was created in a way that contains the same number of samples for each language. Finally, I got:
+As **XQuAD** is just an evaluation dataset, I used `Data augmentation techniques` (scraping, neural machine translation, etc) to obtain more samples and splited the dataset in order to have a train and test set. The test set was created in a way that contains the same number of samples for each language. Finally, I got:
 | Dataset | # samples |
 | ----------- | --------- |
@@ -101,7 +101,7 @@ from transformers import pipeline
 qa_pipeline = pipeline(
     "question-answering",
     model="mrm8488/bert-multi-uncased-finetuned-xquadv1",
-    tokenizer="bert-multi-uncased-finetuned-xquadv1"
+    tokenizer="mrm8488/bert-multi-uncased-finetuned-xquadv1"
 )
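
The tokenizer fix amounts to passing the same namespaced identifier for both `model` and `tokenizer`. One way to keep the two from drifting apart again is to derive both from a single constant. The following is only a minimal sketch: the helper names `build_qa_pipeline_kwargs` and `load_qa_pipeline` are hypothetical and not part of the README or of `transformers`, and it assumes `transformers.pipeline` accepts `task`, `model`, and `tokenizer` as keyword arguments.

```python
# Single source of truth for the hub identifier (taken from the patch above).
MODEL_ID = "mrm8488/bert-multi-uncased-finetuned-xquadv1"


def build_qa_pipeline_kwargs(model_id: str = MODEL_ID) -> dict:
    """Hypothetical helper: build pipeline arguments so that the model and
    the tokenizer always point at the same identifier."""
    return {
        "task": "question-answering",
        "model": model_id,
        "tokenizer": model_id,
    }


def load_qa_pipeline(model_id: str = MODEL_ID):
    """Hypothetical helper: create the question-answering pipeline.

    The import is done lazily so that building the arguments does not
    require `transformers` to be installed. Calling this function
    downloads the model weights from the hub.
    """
    from transformers import pipeline

    return pipeline(**build_qa_pipeline_kwargs(model_id))
```

Calling `load_qa_pipeline()` should then behave like the corrected README snippet, since both arguments resolve to `mrm8488/bert-multi-uncased-finetuned-xquadv1`.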