diff --git a/docs/source/en/quicktour.mdx b/docs/source/en/quicktour.mdx
index 3e2eadc1f7..4c7defc039 100644
--- a/docs/source/en/quicktour.mdx
+++ b/docs/source/en/quicktour.mdx
@@ -528,7 +528,7 @@ All models are a standard [`tf.keras.Model`](https://www.tensorflow.org/api_docs
    ```py
    >>> dataset = dataset.map(tokenize_dataset)  # doctest: +SKIP
    >>> tf_dataset = model.prepare_tf_dataset(
-   ...     dataset, batch_size=16, shuffle=True, tokenizer=tokenizer
+   ...     dataset["train"], batch_size=16, shuffle=True, tokenizer=tokenizer
    ... )  # doctest: +SKIP
    ```
 
@@ -538,7 +538,7 @@ All models are a standard [`tf.keras.Model`](https://www.tensorflow.org/api_docs
    >>> from tensorflow.keras.optimizers import Adam
 
    >>> model.compile(optimizer=Adam(3e-5))
-   >>> model.fit(dataset)  # doctest: +SKIP
+   >>> model.fit(tf_dataset)  # doctest: +SKIP
    ```
 
 ## What's next?
diff --git a/docs/source/en/training.mdx b/docs/source/en/training.mdx
index 4d802db563..52b4f157a3 100644
--- a/docs/source/en/training.mdx
+++ b/docs/source/en/training.mdx
@@ -247,7 +247,7 @@ reduces the number of padding tokens compared to padding the entire dataset.
 
 
 ```py
->>> tf_dataset = model.prepare_tf_dataset(dataset, batch_size=16, shuffle=True, tokenizer=tokenizer)
+>>> tf_dataset = model.prepare_tf_dataset(dataset["train"], batch_size=16, shuffle=True, tokenizer=tokenizer)
 ```
 
 Note that in the code sample above, you need to pass the tokenizer to `prepare_tf_dataset` so it can correctly pad batches as they're loaded.