Docs: change some input_ids doc reference from BertTokenizer to AutoTokenizer (#24730)

Joao Gante
2023-07-10 17:57:26 +01:00
committed by GitHub
parent 2541108564
commit a074a5d34d
14 changed files with 17 additions and 17 deletions
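
The commit does not say how the edits were produced. As an illustration only, the same one-word substitution can be sketched with the standard library; `PATTERN`, `update_docstring`, and `sample` are hypothetical names introduced here, not part of the repository.

```python
import re
from typing import Tuple

# Hypothetical pattern: match the cross-reference [`BertTokenizer`] only when it
# follows the stock sentence used in the input_ids docstrings, so other
# mentions of BertTokenizer are left alone.
PATTERN = re.compile(r"(Indices can be obtained using \[`)BertTokenizer(`\])")


def update_docstring(text: str) -> Tuple[str, int]:
    """Return the rewritten text and the number of substitutions made."""
    return PATTERN.subn(r"\1AutoTokenizer\2", text)


# Sample text copied from one of the docstrings touched by this commit.
sample = (
    "Indices of input sequence tokens in the vocabulary.\n"
    "Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.encode`] and\n"
    "[`PreTrainedTokenizer.__call__`] for details.\n"
)

updated, count = update_docstring(sample)
print(count)
print(updated)
```

Anchoring the pattern on the surrounding sentence is a deliberate choice in this sketch: it keeps the substitution from touching docstrings where `BertTokenizer` is the intended reference.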

@@ -32,7 +32,7 @@ LOGITS_PROCESSOR_INPUTS_DOCSTRING = r"""
 input_ids (`torch.LongTensor` of shape `(batch_size, sequence_length)`):
 Indices of input sequence tokens in the vocabulary.
-Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.encode`] and
+Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.encode`] and
 [`PreTrainedTokenizer.__call__`] for details.
 [What are input IDs?](../glossary#input-ids)

@@ -17,7 +17,7 @@ STOPPING_CRITERIA_INPUTS_DOCSTRING = r"""
 input_ids (`torch.LongTensor` of shape `(batch_size, sequence_length)`):
 Indices of input sequence tokens in the vocabulary.
-Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.encode`] and
+Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.encode`] and
 [`PreTrainedTokenizer.__call__`] for details.
 [What are input IDs?](../glossary#input-ids)

@@ -576,7 +576,7 @@ BART_INPUTS_DOCSTRING = r"""
 input_ids (`tf.Tensor` of shape `({0})`):
 Indices of input sequence tokens in the vocabulary.
-Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.encode`] and
+Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.encode`] and
 [`PreTrainedTokenizer.__call__`] for details.
 [What are input IDs?](../glossary#input-ids)

@@ -65,7 +65,7 @@ BRIDGETOWER_START_DOCSTRING = r"""
 BRIDGETOWER_INPUTS_DOCSTRING = r"""
 Args:
 input_ids (`torch.LongTensor` of shape `({0})`):
-Indices of input sequence tokens in the vocabulary. Indices can be obtained using [`BertTokenizer`]. See
+Indices of input sequence tokens in the vocabulary. Indices can be obtained using [`AutoTokenizer`]. See
 [`PreTrainedTokenizer.encode`] and [`PreTrainedTokenizer.__call__`] for details. [What are input
 IDs?](../glossary#input-ids)

@@ -943,7 +943,7 @@ CLIP_TEXT_INPUTS_DOCSTRING = r"""
 input_ids (`np.ndarray`, `tf.Tensor`, `List[tf.Tensor]` ``Dict[str, tf.Tensor]` or `Dict[str, np.ndarray]` and each example must have the shape `({0})`):
 Indices of input sequence tokens in the vocabulary.
-Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.__call__`] and
+Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.__call__`] and
 [`PreTrainedTokenizer.encode`] for details.
 [What are input IDs?](../glossary#input-ids)
@@ -1000,7 +1000,7 @@ CLIP_INPUTS_DOCSTRING = r"""
 input_ids (`np.ndarray`, `tf.Tensor`, `List[tf.Tensor]` ``Dict[str, tf.Tensor]` or `Dict[str, np.ndarray]` and each example must have the shape `({0})`):
 Indices of input sequence tokens in the vocabulary.
-Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.__call__`] and
+Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.__call__`] and
 [`PreTrainedTokenizer.encode`] for details.
 [What are input IDs?](../glossary#input-ids)

@@ -882,7 +882,7 @@ FUNNEL_INPUTS_DOCSTRING = r"""
 input_ids (`torch.LongTensor` of shape `({0})`):
 Indices of input sequence tokens in the vocabulary.
-Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.encode`] and
+Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.encode`] and
 [`PreTrainedTokenizer.__call__`] for details.
 [What are input IDs?](../glossary#input-ids)

@@ -1502,7 +1502,7 @@ GROUPVIT_TEXT_INPUTS_DOCSTRING = r"""
 input_ids (`np.ndarray`, `tf.Tensor`, `List[tf.Tensor]` ``Dict[str, tf.Tensor]` or `Dict[str, np.ndarray]` and each example must have the shape `({0})`):
 Indices of input sequence tokens in the vocabulary.
-Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.__call__`] and
+Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.__call__`] and
 [`PreTrainedTokenizer.encode`] for details.
 [What are input IDs?](../glossary#input-ids)
@@ -1560,7 +1560,7 @@ GROUPVIT_INPUTS_DOCSTRING = r"""
 input_ids (`np.ndarray`, `tf.Tensor`, `List[tf.Tensor]` ``Dict[str, tf.Tensor]` or `Dict[str, np.ndarray]` and each example must have the shape `({0})`):
 Indices of input sequence tokens in the vocabulary.
-Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.__call__`] and
+Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.__call__`] and
 [`PreTrainedTokenizer.encode`] for details.
 [What are input IDs?](../glossary#input-ids)

@@ -1560,7 +1560,7 @@ LED_INPUTS_DOCSTRING = r"""
 input_ids (`tf.Tensor` of shape `({0})`):
 Indices of input sequence tokens in the vocabulary.
-Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.encode`] and
+Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.encode`] and
 [`PreTrainedTokenizer.__call__`] for details.
 [What are input IDs?](../glossary#input-ids)

@@ -106,7 +106,7 @@ MMBT_INPUTS_DOCSTRING = r"""
 Encoder, the shape would be (batch_size, channels, height, width)
 input_ids (`torch.LongTensor` of shape `(batch_size, sequence_length)`):
 Indices of input sequence tokens in the vocabulary. It does not expect [CLS] token to be added as it's
-appended to the end of other modality embeddings. Indices can be obtained using [`BertTokenizer`]. See
+appended to the end of other modality embeddings. Indices can be obtained using [`AutoTokenizer`]. See
 [`PreTrainedTokenizer.encode`] and [`PreTrainedTokenizer.__call__`] for details.
 [What are input IDs?](../glossary#input-ids)

@@ -761,7 +761,7 @@ MOBILEBERT_INPUTS_DOCSTRING = r"""
 input_ids (`torch.LongTensor` of shape `({0})`):
 Indices of input sequence tokens in the vocabulary.
-Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.encode`] and
+Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.encode`] and
 [`PreTrainedTokenizer.__call__`] for details.
 [What are input IDs?](../glossary#input-ids)

@@ -960,7 +960,7 @@ T5_INPUTS_DOCSTRING = r"""
 Indices of input sequence tokens in the vocabulary. T5 is a model with relative position embeddings so you
 should be able to pad the inputs on the right or the left.
-Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.__call__`] and
+Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.__call__`] and
 [`PreTrainedTokenizer.encode`] for details.
 [What are input IDs?](../glossary#input-ids)

@@ -814,7 +814,7 @@ TRANSFO_XL_INPUTS_DOCSTRING = r"""
 input_ids (`tf.Tensor` or `Numpy array` of shape `(batch_size, sequence_length)`):
 Indices of input sequence tokens in the vocabulary.
-Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.__call__`] and
+Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.__call__`] and
 [`PreTrainedTokenizer.encode`] for details.
 [What are input IDs?](../glossary#input-ids)

@@ -610,7 +610,7 @@ VILT_START_DOCSTRING = r"""
 VILT_INPUTS_DOCSTRING = r"""
 Args:
 input_ids (`torch.LongTensor` of shape `({0})`):
-Indices of input sequence tokens in the vocabulary. Indices can be obtained using [`BertTokenizer`]. See
+Indices of input sequence tokens in the vocabulary. Indices can be obtained using [`AutoTokenizer`]. See
 [`PreTrainedTokenizer.encode`] and [`PreTrainedTokenizer.__call__`] for details. [What are input
 IDs?](../glossary#input-ids)
@@ -665,7 +665,7 @@ VILT_INPUTS_DOCSTRING = r"""
 VILT_IMAGES_AND_TEXT_CLASSIFICATION_INPUTS_DOCSTRING = r"""
 Args:
 input_ids (`torch.LongTensor` of shape `({0})`):
-Indices of input sequence tokens in the vocabulary. Indices can be obtained using [`BertTokenizer`]. See
+Indices of input sequence tokens in the vocabulary. Indices can be obtained using [`AutoTokenizer`]. See
 [`PreTrainedTokenizer.encode`] and [`PreTrainedTokenizer.__call__`] for details. [What are input
 IDs?](../glossary#input-ids)

@@ -851,7 +851,7 @@ class TF{{cookiecutter.camelcase_modelname}}PreTrainedModel(TFPreTrainedModel):
 input_ids (`np.ndarray`, `tf.Tensor`, `List[tf.Tensor]`, `Dict[str, tf.Tensor]` or `Dict[str, np.ndarray]` and each example must have the shape `({0})`):
 Indices of input sequence tokens in the vocabulary.
-Indices can be obtained using [`BertTokenizer`]. See
+Indices can be obtained using [`AutoTokenizer`]. See
 [`PreTrainedTokenizer.__call__`] and [`PreTrainedTokenizer.encode`] for
 details.