From 05db5bc1afea196e548ae3214d3413c321fcfda1 Mon Sep 17 00:00:00 2001
From: Thomas Wolf <thomwolf@users.noreply.github.com>
Date: Thu, 14 Nov 2019 22:40:22 +0100
Subject: [PATCH] added small comparison between BERT, RoBERTa and DistilBERT

---
 examples/README.md | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/examples/README.md b/examples/README.md
index 2b66b92f1a..abb4cb6e5a 100644
--- a/examples/README.md
+++ b/examples/README.md
@@ -554,6 +554,16 @@ On the test dataset the following results could be achieved:
 10/04/2019 00:42:42 - INFO - __main__ -     recall = 0.8624150210424085
 ```
 
+### Comparing BERT (large, cased), RoBERTa (large, cased) and DistilBERT (base, uncased)
+
+The following table compares BERT (large, cased), RoBERTa (large, cased) and DistilBERT (base, uncased), all trained with the hyperparameters specified in the [example documentation](https://huggingface.co/transformers/examples.html#named-entity-recognition). Each score comes from a single run:
+
+| Model                     | F-Score Dev | F-Score Test |
+| ------------------------- | ----------- | ------------ |
+| `bert-large-cased`        | 95.59       | 91.70        |
+| `roberta-large`           | 95.96       | 91.87        |
+| `distilbert-base-uncased` | 94.34       | 90.32        |
+
 ## Abstractive summarization
 
 Based on the script