From 8eeefcb576b48f0ed14bc75c4d7d187afd2a6feb Mon Sep 17 00:00:00 2001
From: Kyeongpil Kang <rudvlf0413@korea.ac.kr>
Date: Fri, 20 Mar 2020 07:21:49 +0900
Subject: [PATCH] Update 01-training-tokenizers.ipynb (typo issue) (#3343)

I found two typos in the explanation of the encoding properties.

The original sentences:
If your was made of multiple \"parts\" such as (question, context), then this would be a vector with for each token the segment it belongs to
If your has been truncated into multiple subparts because of a length limit (for BERT for example the sequence length is limited to 512), this will contain all the remaining overflowing parts.

The word "input" should be inserted after "If your" in both sentences.
---
 notebooks/01-training-tokenizers.ipynb | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
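
Reviewer note (not part of the commit): the encoding properties described in
the hunk below can be illustrated with a toy sketch. Everything here is an
assumption made for illustration: `toy_encode` is a hypothetical helper that
splits on whitespace, it is not the tokenizers library API, and `overflowing`
is simplified to the list of leftover tokens.

```python
# Toy illustration only: a hypothetical whitespace "tokenizer", not the real API.
SPECIAL = ("[CLS]", "[SEP]", "[PAD]")


def toy_encode(question, context, max_length=8):
    """Build a (question, context) pair and derive the masks described in the notebook."""
    first = ["[CLS]"] + question.split() + ["[SEP]"]
    second = context.split() + ["[SEP]"]
    tokens = first + second

    # type_ids: 0 for tokens of the first part, 1 for tokens of the second part.
    type_ids = [0] * len(first) + [1] * len(second)

    # Truncation: anything past max_length is kept aside as the overflowing part.
    overflowing = tokens[max_length:]
    tokens = tokens[:max_length]
    type_ids = type_ids[:max_length]

    # attention_mask: 1 for real tokens, 0 for padding added below.
    attention_mask = [1] * len(tokens)
    pad = max_length - len(tokens)
    tokens += ["[PAD]"] * pad
    type_ids += [0] * pad
    attention_mask += [0] * pad

    # special_tokens_mask: 1 wherever a special token sits.
    special_tokens_mask = [1 if t in SPECIAL else 0 for t in tokens]

    return {
        "tokens": tokens,
        "type_ids": type_ids,
        "attention_mask": attention_mask,
        "special_tokens_mask": special_tokens_mask,
        "overflowing": overflowing,
    }


if __name__ == "__main__":
    for name, value in toy_encode("who wrote it", "she did", max_length=10).items():
        print(name, value)
```

With a smaller `max_length` the pair no longer fits, so the trailing tokens
move into `overflowing` and no padding is added.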

diff --git a/notebooks/01-training-tokenizers.ipynb b/notebooks/01-training-tokenizers.ipynb
index 1a56594961..59def58eb4 100644
--- a/notebooks/01-training-tokenizers.ipynb
+++ b/notebooks/01-training-tokenizers.ipynb
@@ -332,8 +332,8 @@
     "- input_ids: The generated tokens with their integer representation\n",
     "- attention_mask: If your input has been padded by the tokenizer, then this would be a vector of 1 for any non padded token and 0 for padded ones.\n",
     "- special_token_mask: If your input contains special tokens such as [CLS], [SEP], [MASK], [PAD], then this would be a vector with 1 in places where a special token has been added.\n",
-    "- type_ids: If your was made of multiple \"parts\" such as (question, context), then this would be a vector with for each token the segment it belongs to.\n",
-    "- overflowing: If your has been truncated into multiple subparts because of a length limit (for BERT for example the sequence length is limited to 512), this will contain all the remaining overflowing parts."
+    "- type_ids: If your input was made of multiple \"parts\" such as (question, context), then this would be a vector with for each token the segment it belongs to.\n",
+    "- overflowing: If your input has been truncated into multiple subparts because of a length limit (for BERT for example the sequence length is limited to 512), this will contain all the remaining overflowing parts."
    ]
   }
  ],