From 641adca55832ed9c5648f54dcd8926d67d3511db Mon Sep 17 00:00:00 2001
From: Victor Geislinger <9027783+MrGeislinger@users.noreply.github.com>
Date: Thu, 3 Aug 2023 14:17:30 -0700
Subject: [PATCH] Fix typo: Roberta -> RoBERTa (#25302)

---
 docs/source/en/tokenizer_summary.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/en/tokenizer_summary.md b/docs/source/en/tokenizer_summary.md
index b13c4e83b8..5a23c7bf84 100644
--- a/docs/source/en/tokenizer_summary.md
+++ b/docs/source/en/tokenizer_summary.md
@@ -141,7 +141,7 @@ on.
 
 Byte-Pair Encoding (BPE) was introduced in [Neural Machine Translation of Rare Words with Subword Units (Sennrich et
 al., 2015)](https://arxiv.org/abs/1508.07909). BPE relies on a pre-tokenizer that splits the training data into
-words. Pretokenization can be as simple as space tokenization, e.g. [GPT-2](model_doc/gpt2), [Roberta](model_doc/roberta). More advanced pre-tokenization include rule-based tokenization, e.g. [XLM](model_doc/xlm),
+words. Pre-tokenization can be as simple as space tokenization, e.g. [GPT-2](model_doc/gpt2), [RoBERTa](model_doc/roberta). More advanced pre-tokenization includes rule-based tokenization, e.g. [XLM](model_doc/xlm),
 [FlauBERT](model_doc/flaubert) which uses Moses for most languages, or [GPT](model_doc/gpt) which uses
 Spacy and ftfy, to count the frequency of each word in the training corpus.