From 7516bcf27319a2aea9bbe927f8e4d8e501e23c99 Mon Sep 17 00:00:00 2001
From: Romain Rigaux <romain.rigaux@gmail.com>
Date: Tue, 18 Aug 2020 07:23:25 -0700
Subject: [PATCH] [docs] Fix number of 'ug' occurrences in tokenizer_summary
 (#6574)

---
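Note: the corrected count can be checked with a short sketch. It assumes the
example corpus is ('hug', 10), ('pug', 5), ('pun', 12), ('bun', 4), ('hugs', 5).
Only the 'hug' and 'hugs' counts appear in this hunk; the other three are
assumptions taken from the surrounding document. `pair_counts` is a
hypothetical helper, not part of any library.

```python
from collections import Counter

# Assumed word frequencies. Only 'hug' (10) and 'hugs' (5) are stated in the
# hunk below; the 'pug', 'pun', and 'bun' counts are assumptions.
corpus = {"hug": 10, "pug": 5, "pun": 12, "bun": 4, "hugs": 5}


def pair_counts(word_freqs):
    """Count adjacent symbol pairs, weighted by how often each word occurs."""
    counts = Counter()
    for word, freq in word_freqs.items():
        symbols = list(word)  # start from single characters, as BPE does
        for left, right in zip(symbols, symbols[1:]):
            counts[left + right] += freq
    return counts


counts = pair_counts(corpus)
print(counts["hu"])            # 10 ('hug') + 5 ('hugs')
print(counts["ug"])            # 10 ('hug') + 5 ('pug') + 5 ('hugs')
print(counts.most_common(1))   # the pair that the first merge rule would pick
```

Under these assumed counts, 'ug' totals 20 and stays the most frequent pair,
so the first merge rule described in the text is unchanged by this fix.
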
 docs/source/tokenizer_summary.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/tokenizer_summary.rst b/docs/source/tokenizer_summary.rst
index 51e2ce160f..72b322a32c 100644
--- a/docs/source/tokenizer_summary.rst
+++ b/docs/source/tokenizer_summary.rst
@@ -130,7 +130,7 @@ Then the base vocabulary is ['b', 'g', 'h', 'n', 'p', 's', 'u'] and all our word
 
 We then take each pair of symbols and look at the most frequent. For instance 'hu' is present `10 + 5 = 15` times (10
 times in the 10 occurrences of 'hug', 5 times in the 5 occurrences of 'hugs'). The most frequent here is 'ug', present
-`10 + 5 + 2 + 5 = 22` times in total. So the first merge rule the tokenizer learns is to group all 'u' and 'g' together
+`10 + 5 + 5 = 20` times in total. So the first merge rule the tokenizer learns is to group all 'u' and 'g' together
 then it adds 'ug' to the vocabulary. Our corpus then becomes
 
 ::