Doc to dataset (#18037)

* Link to the Datasets doc

* Remove unwanted file
Sylvain Gugger
2022-07-06 12:10:06 -04:00
committed by GitHub
parent be79cd7d8e
commit 2e90c3df8f
16 changed files with 34 additions and 34 deletions

@@ -33,7 +33,7 @@ pip install transformers datasets accelerate nvidia-ml-py3
The `nvidia-ml-py3` library allows us to monitor the memory usage of the models from within Python. You might be familiar with the `nvidia-smi` command in the terminal; this library allows us to access the same information directly in Python.
Then we create some dummy data. We create random token IDs between 100 and 30000 and binary labels for a classifier. In total we get 512 sequences each with length 512 and store them in a [`Dataset`](https://huggingface.co/docs/datasets/package_reference/main_classes.html?highlight=dataset#datasets.Dataset) with PyTorch format.
Then we create some dummy data. We create random token IDs between 100 and 30000 and binary labels for a classifier. In total we get 512 sequences each with length 512 and store them in a [`~datasets.Dataset`] with PyTorch format.
```py