From ad908686272801d9c53effad8ec513ef2cda85dd Mon Sep 17 00:00:00 2001
From: thomwolf
Date: Mon, 4 Nov 2019 11:27:22 +0100
Subject: [PATCH] Update example readme

---
 examples/README.md | 39 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/examples/README.md b/examples/README.md
index 382d794fcb..9f00c45225 100644
--- a/examples/README.md
+++ b/examples/README.md
@@ -8,7 +8,7 @@ similar API between the different models.
 | [Language Model fine-tuning](#language-model-fine-tuning) | Fine-tuning the library models for language modeling on a text dataset. Causal language modeling for GPT/GPT-2, masked language modeling for BERT/RoBERTa. |
 | [Language Generation](#language-generation) | Conditional text generation using the auto-regressive models of the library: GPT, GPT-2, Transformer-XL and XLNet. |
 | [GLUE](#glue) | Examples running BERT/XLM/XLNet/RoBERTa on the 9 GLUE tasks. Examples feature distributed training as well as half-precision. |
-| [SQuAD](#squad) | Using BERT for question answering, examples with distributed training. |
+| [SQuAD](#squad) | Using BERT/RoBERTa/XLNet/XLM for question answering, examples with distributed training. |
 | [Multiple Choice](#multiple-choice) | Examples running BERT/XLNet/RoBERTa on the SWAG/RACE/ARC tasks.
 
 ## Language model fine-tuning
@@ -390,3 +390,40 @@ exact_match = 86.91
 This fine-tuneds model is available as a checkpoint under the reference
 `bert-large-uncased-whole-word-masking-finetuned-squad`.
 
+#### Fine-tuning XLNet on SQuAD
+
+This example code fine-tunes XLNet on the SQuAD dataset. See above to download the data for SQuAD.
+
+```bash
+export SQUAD_DIR=/path/to/SQUAD
+
+python run_squad.py \
+    --model_type xlnet \
+    --model_name_or_path xlnet-large-cased \
+    --do_train \
+    --do_eval \
+    --do_lower_case \
+    --train_file $SQUAD_DIR/train-v1.1.json \
+    --predict_file $SQUAD_DIR/dev-v1.1.json \
+    --learning_rate 3e-5 \
+    --num_train_epochs 2 \
+    --max_seq_length 384 \
+    --doc_stride 128 \
+    --output_dir ./wwm_cased_finetuned_squad/ \
+    --per_gpu_eval_batch_size=4 \
+    --per_gpu_train_batch_size=4 \
+    --save_steps 5000
+```
+
+Training with the previously defined hyper-parameters yields the following results:
+
+```python
+{
+"exact": 85.45884578997162,
+"f1": 92.5974600601065,
+"total": 10570,
+"HasAns_exact": 85.45884578997162,
+"HasAns_f1": 92.59746006010651,
+"HasAns_total": 10570
+}
+```