Framework split (#16030)

* First files

* More files

* Last files

* Style
This commit is contained in:
Sylvain Gugger
2022-03-15 10:13:34 -04:00
committed by GitHub
parent 4a353cacb7
commit 4f4e5ddbcb
17 changed files with 465 additions and 132 deletions

@@ -81,6 +81,8 @@ pip install -r requirements.txt
## Run a script
<frameworkcontent>
<pt>
The example script downloads and preprocesses a dataset from the 🤗 [Datasets](https://huggingface.co/docs/datasets/) library. Then the script uses the [Trainer](https://huggingface.co/docs/transformers/main_classes/trainer) to fine-tune a model with an architecture that supports summarization on that dataset. The following example shows how to fine-tune [T5-small](https://huggingface.co/t5-small) on the [CNN/DailyMail](https://huggingface.co/datasets/cnn_dailymail) dataset. The T5 model requires an additional `source_prefix` argument due to how it was trained. The prefix acts as a prompt that tells T5 this is a summarization task.
```bash
@@ -96,7 +98,12 @@ python examples/pytorch/summarization/run_summarization.py \
--per_device_eval_batch_size=4 \
--overwrite_output_dir \
--predict_with_generate
===PT-TF-SPLIT===
```
</pt>
<tf>
The example script downloads and preprocesses a dataset from the 🤗 [Datasets](https://huggingface.co/docs/datasets/) library. Then the script uses Keras to fine-tune a model with an architecture that supports summarization on that dataset. The following example shows how to fine-tune [T5-small](https://huggingface.co/t5-small) on the [CNN/DailyMail](https://huggingface.co/datasets/cnn_dailymail) dataset. The T5 model requires an additional `source_prefix` argument due to how it was trained. The prefix acts as a prompt that tells T5 this is a summarization task.
```bash
python examples/tensorflow/summarization/run_summarization.py \
--model_name_or_path t5-small \
--dataset_name cnn_dailymail \
@@ -108,6 +115,8 @@ python examples/tensorflow/summarization/run_summarization.py \
--do_train \
--do_eval
```
</tf>
</frameworkcontent>
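As a rough illustration of what the `source_prefix` argument is for, the sketch below prepends a task prefix to each input document before it would be tokenized. It is not the code from `run_summarization.py`: the helper name `add_source_prefix` is hypothetical, and the value `"summarize: "` is assumed here as the prefix T5 conventionally expects for summarization.

```python
def add_source_prefix(documents, source_prefix="summarize: "):
    """Prepend the task prefix to every input document.

    Hypothetical helper for illustration only; the default prefix
    is an assumption based on the usual T5 summarization convention.
    """
    return [source_prefix + doc for doc in documents]


articles = ["The quick brown fox jumped over the lazy dog."]
prefixed = add_source_prefix(articles)
print(prefixed[0])
```

Models that were not trained with task prefixes would simply be given an empty `source_prefix`, which leaves the inputs unchanged.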
## Distributed training and mixed precision
@@ -137,10 +146,10 @@ TensorFlow scripts utilize a [`MirroredStrategy`](https://www.tensorflow.org/gui
## Run a script on a TPU
<frameworkcontent>
<pt>
Tensor Processing Units (TPUs) are specifically designed to accelerate machine learning workloads. PyTorch supports TPUs with the [XLA](https://www.tensorflow.org/xla) deep learning compiler (see the [PyTorch/XLA README](https://github.com/pytorch/xla/blob/master/README.md) for more details). To use a TPU, launch the `xla_spawn.py` script and use the `num_cores` argument to set the number of TPU cores you want to use.
```bash
python xla_spawn.py --num_cores 8 \
summarization/run_summarization.py \
@@ -155,7 +164,12 @@ python xla_spawn.py --num_cores 8 \
--per_device_eval_batch_size=4 \
--overwrite_output_dir \
--predict_with_generate
===PT-TF-SPLIT===
```
</pt>
<tf>
Tensor Processing Units (TPUs) are specifically designed to accelerate machine learning workloads. TensorFlow scripts use a [`TPUStrategy`](https://www.tensorflow.org/guide/distributed_training#tpustrategy) for training on TPUs. To use a TPU, pass the name of the TPU resource to the `tpu` argument.
```bash
python run_summarization.py \
--tpu name_of_tpu_resource \
--model_name_or_path t5-small \
@@ -168,6 +182,8 @@ python run_summarization.py \
--do_train \
--do_eval
```
</tf>
</frameworkcontent>
## Run a script with 🤗 Accelerate