Fix some typos in the docs (#14126)

* Fix some typos in the docs
* Fix a styling issue
* Fix code quality check error
2021-10-25 15:10:44 +03:30
parent 95bab53868
commit 6b83090e80
5 changed files with 9 additions and 8 deletions
--- a/docs/source/tokenizer_summary.rst
+++ b/docs/source/tokenizer_summary.rst
@@ -182,9 +182,10 @@ base vocabulary, we obtain:

 BPE then counts the frequency of each possible symbol pair and picks the symbol pair that occurs most frequently. In
 the example above ``"h"`` followed by ``"u"`` is present `10 + 5 = 15` times (10 times in the 10 occurrences of
-``"hug"``, 5 times in the 5 occurrences of "hugs"). However, the most frequent symbol pair is ``"u"`` followed by "g",
-occurring `10 + 5 + 5 = 20` times in total. Thus, the first merge rule the tokenizer learns is to group all ``"u"``
-symbols followed by a ``"g"`` symbol together. Next, "ug" is added to the vocabulary. The set of words then becomes
+``"hug"``, 5 times in the 5 occurrences of ``"hugs"``). However, the most frequent symbol pair is ``"u"`` followed by
+``"g"``, occurring `10 + 5 + 5 = 20` times in total. Thus, the first merge rule the tokenizer learns is to group all
+``"u"`` symbols followed by a ``"g"`` symbol together. Next, ``"ug"`` is added to the vocabulary. The set of words then
+becomes

 .. code-block::