update to new script; notebook notes (#10241)
@@ -258,17 +258,16 @@ To deploy this feature:
 2. Add ``--sharded_ddp`` to the command line arguments, and make sure you have added the distributed launcher ``-m
    torch.distributed.launch --nproc_per_node=NUMBER_OF_GPUS_YOU_HAVE`` if you haven't been using it already.
 
-For example here is how you could use it for ``finetune_trainer.py`` with 2 GPUs:
+For example here is how you could use it for ``run_seq2seq.py`` with 2 GPUs:
 
 .. code-block:: bash
 
-    cd examples/seq2seq
-    python -m torch.distributed.launch --nproc_per_node=2 ./finetune_trainer.py \
-    --model_name_or_path sshleifer/distill-mbart-en-ro-12-4 --data_dir wmt_en_ro \
+    python -m torch.distributed.launch --nproc_per_node=2 examples/seq2seq/run_seq2seq.py \
+    --model_name_or_path t5-small --per_device_train_batch_size 1 \
     --output_dir output_dir --overwrite_output_dir \
-    --do_train --n_train 500 --num_train_epochs 1 \
-    --per_device_train_batch_size 1 --freeze_embeds \
-    --src_lang en_XX --tgt_lang ro_RO --task translation \
+    --do_train --max_train_samples 500 --num_train_epochs 1 \
+    --dataset_name wmt16 --dataset_config "ro-en" \
+    --task translation_en_to_ro --source_prefix "translate English to Romanian: " \
     --fp16 --sharded_ddp
 
 Notes:
@@ -344,17 +343,18 @@ In fact, you can continue using ``-m torch.distributed.launch`` with DeepSpeed a
 the ``deepspeed`` launcher. But since in the DeepSpeed documentation it'll be used everywhere, for consistency we will
 use it here as well.
 
-Here is an example of running ``finetune_trainer.py`` under DeepSpeed deploying all available GPUs:
+Here is an example of running ``run_seq2seq.py`` under DeepSpeed deploying all available GPUs:
 
 .. code-block:: bash
 
-    cd examples/seq2seq
-    deepspeed ./finetune_trainer.py --deepspeed ds_config.json \
-    --model_name_or_path sshleifer/distill-mbart-en-ro-12-4 --data_dir wmt_en_ro \
-    --output_dir output_dir --overwrite_output_dir \
-    --do_train --n_train 500 --num_train_epochs 1 \
-    --per_device_train_batch_size 1 --freeze_embeds \
-    --src_lang en_XX --tgt_lang ro_RO --task translation
+    deepspeed examples/seq2seq/run_seq2seq.py \
+    --deepspeed examples/tests/deepspeed/ds_config.json \
+    --model_name_or_path t5-small --per_device_train_batch_size 1 \
+    --output_dir output_dir --overwrite_output_dir --fp16 \
+    --do_train --max_train_samples 500 --num_train_epochs 1 \
+    --dataset_name wmt16 --dataset_config "ro-en" \
+    --task translation_en_to_ro --source_prefix "translate English to Romanian: "
+
 
 Note that in the DeepSpeed documentation you are likely to see ``--deepspeed --deepspeed_config ds_config.json`` - i.e.
 two DeepSpeed-related arguments, but for the sake of simplicity, and since there are already so many arguments to deal
@@ -372,13 +372,13 @@ To deploy DeepSpeed with one GPU adjust the :class:`~transformers.Trainer` comma
 
 .. code-block:: bash
 
-    cd examples/seq2seq
-    deepspeed --num_gpus=1 ./finetune_trainer.py --deepspeed ds_config.json \
-    --model_name_or_path sshleifer/distill-mbart-en-ro-12-4 --data_dir wmt_en_ro \
-    --output_dir output_dir --overwrite_output_dir \
-    --do_train --n_train 500 --num_train_epochs 1 \
-    --per_device_train_batch_size 1 --freeze_embeds \
-    --src_lang en_XX --tgt_lang ro_RO --task translation
+    deepspeed --num_gpus=1 examples/seq2seq/run_seq2seq.py \
+    --deepspeed examples/tests/deepspeed/ds_config.json \
+    --model_name_or_path t5-small --per_device_train_batch_size 1 \
+    --output_dir output_dir --overwrite_output_dir --fp16 \
+    --do_train --max_train_samples 500 --num_train_epochs 1 \
+    --dataset_name wmt16 --dataset_config "ro-en" \
+    --task translation_en_to_ro --source_prefix "translate English to Romanian: "
 
 This is almost the same as with multiple-GPUs, but here we tell DeepSpeed explicitly to use just one GPU. By default,
 DeepSpeed deploys all GPUs it can see. If you have only 1 GPU to start with, then you don't need this argument. The
@@ -424,17 +424,17 @@ Notes:
 
 .. code-block:: bash
 
-    deepspeed --include localhost:1 ./finetune_trainer.py
+    deepspeed --include localhost:1 examples/seq2seq/run_seq2seq.py ...
 
-In this example, we tell DeepSpeed to use GPU 1.
+In this example, we tell DeepSpeed to use GPU 1 (the second GPU).
 
 
 
 Deployment in Notebooks
 =======================================================================================================================
 
-The problem with notebooks is that there is no normal ``deepspeed`` launcher to rely on, so under certain setups we
-have to emulate it.
+The problem with running notebook cells as a script is that there is no normal ``deepspeed`` launcher to rely on, so
+under certain setups we have to emulate it.
 
 Here is how you'd have to adjust your training code in the notebook to use DeepSpeed.
 
@@ -510,6 +510,24 @@ cell with:
     EOT
 
 
+That said, if the script is not in the notebook cells, you can launch ``deepspeed`` normally via shell from a cell
+with:
+
+.. code-block::
+
+    !deepspeed examples/seq2seq/run_seq2seq.py ...
+
+or with ``%%bash`` magic, which lets you write multi-line code for the shell to run:
+
+.. code-block::
+
+    %%bash
+
+    cd /somewhere
+    deepspeed examples/seq2seq/run_seq2seq.py ...
+
+
+
 
 Configuration
 =======================================================================================================================
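
Note: the updated ``deepspeed`` commands in this commit all share one shape: optional launcher options (``--num_gpus=1`` or ``--include localhost:1``), then the script path, then ``--deepspeed`` with a config path, then the script's own arguments. The sketch below assembles such a command in Python, which may be convenient when preparing a launch from a notebook. ``build_deepspeed_command`` is a hypothetical helper written for illustration, not part of ``transformers`` or DeepSpeed, and it only builds the command; it does not run it.

```python
import shlex


def build_deepspeed_command(script, ds_config, script_args, num_gpus=None, include=None):
    """Hypothetical helper: assemble a ``deepspeed`` command as an argument list."""
    cmd = ["deepspeed"]
    # Launcher options go before the script path.
    if num_gpus is not None:
        cmd.append(f"--num_gpus={num_gpus}")
    if include is not None:
        cmd += ["--include", include]
    # The script path, the config flag, then the script's own arguments.
    cmd += [script, "--deepspeed", ds_config]
    cmd += list(script_args)
    return cmd


# Arguments copied from the single-GPU example in this commit.
script_args = [
    "--model_name_or_path", "t5-small", "--per_device_train_batch_size", "1",
    "--output_dir", "output_dir", "--overwrite_output_dir", "--fp16",
    "--do_train", "--max_train_samples", "500", "--num_train_epochs", "1",
    "--dataset_name", "wmt16", "--dataset_config", "ro-en",
    "--task", "translation_en_to_ro",
    "--source_prefix", "translate English to Romanian: ",
]

cmd = build_deepspeed_command(
    "examples/seq2seq/run_seq2seq.py",
    "examples/tests/deepspeed/ds_config.json",
    script_args,
    num_gpus=1,
)

# shlex.join quotes arguments that contain spaces, so the string can be pasted into a shell.
print(shlex.join(cmd))
```

Keeping the command as a list avoids quoting mistakes with values such as the ``--source_prefix`` string, which contains spaces and a colon. The list can be passed to ``subprocess.run`` as is, or joined into a single string for a ``!`` or ``%%bash`` cell.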