diff --git a/docs/source/en/quicktour.mdx b/docs/source/en/quicktour.mdx
index 3e2eadc1f7..4c7defc039 100644
--- a/docs/source/en/quicktour.mdx
+++ b/docs/source/en/quicktour.mdx
@@ -528,7 +528,7 @@ All models are a standard [`tf.keras.Model`](https://www.tensorflow.org/api_docs
    ```py
    >>> dataset = dataset.map(tokenize_dataset)  # doctest: +SKIP
    >>> tf_dataset = model.prepare_tf_dataset(
-   ...     dataset, batch_size=16, shuffle=True, tokenizer=tokenizer
+   ...     dataset["train"], batch_size=16, shuffle=True, tokenizer=tokenizer
    ... )  # doctest: +SKIP
    ```
 
@@ -538,7 +538,7 @@ All models are a standard [`tf.keras.Model`](https://www.tensorflow.org/api_docs
    >>> from tensorflow.keras.optimizers import Adam
 
    >>> model.compile(optimizer=Adam(3e-5))
-   >>> model.fit(dataset)  # doctest: +SKIP
+   >>> model.fit(tf_dataset)  # doctest: +SKIP
    ```
 
 ## What's next?
diff --git a/docs/source/en/training.mdx b/docs/source/en/training.mdx
index 4d802db563..52b4f157a3 100644
--- a/docs/source/en/training.mdx
+++ b/docs/source/en/training.mdx
@@ -247,7 +247,7 @@ reduces the number of padding tokens compared to padding the entire dataset.
 
 
 ```py
->>> tf_dataset = model.prepare_tf_dataset(dataset, batch_size=16, shuffle=True, tokenizer=tokenizer)
+>>> tf_dataset = model.prepare_tf_dataset(dataset["train"], batch_size=16, shuffle=True, tokenizer=tokenizer)
 ```
 
 Note that in the code sample above, you need to pass the tokenizer to `prepare_tf_dataset` so it can correctly pad batches as they're loaded.