From 6d6b916f48c10483f6b0f07263568e79c8797c9e Mon Sep 17 00:00:00 2001
From: Thomas Wolf
Date: Sun, 11 Nov 2018 17:00:49 +0100
Subject: [PATCH] update to BERT-large results

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 4eb31a0ece..17fb2f1f70 100644
--- a/README.md
+++ b/README.md
@@ -206,7 +206,7 @@ Training with the previous hyper-parameters gave us the following results:
 
 The options we list above allow to fine-tune BERT-large rather easily on GPU(s) instead of the TPU used by the original implementation.
 
-For example, fine-tuning BERT-large on SQuAD can be done on a server with 4 k-80 (these are pretty old now) in 18 hours. Our results are similar to the TensorFlow implementation results:
+For example, fine-tuning BERT-large on SQuAD can be done on a server with 4 k-80 (these are pretty old now) in 18 hours. Our results are similar to the TensorFlow implementation results (actually slightly higher):
 ```bash
 {"exact_match": 84.56953642384106, "f1": 91.04028647786927}
 ```