From b8b1e442e3bc43c97a68152313d3f84e3e0d03a0 Mon Sep 17 00:00:00 2001
From: Steven Basart <130421631+steven-basart@users.noreply.github.com>
Date: Tue, 23 Apr 2024 12:04:17 -0400
Subject: [PATCH] Rename torch.run to torchrun (#30405)

torch.run does not exist anywhere as far as I can tell.
---
 docs/source/en/deepspeed.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/en/deepspeed.md b/docs/source/en/deepspeed.md
index eacd6e1c10..868021a9cd 100644
--- a/docs/source/en/deepspeed.md
+++ b/docs/source/en/deepspeed.md
@@ -659,7 +659,7 @@ You could also use the [`Trainer`]'s `--save_on_each_node` argument to automatic
 For [torchrun](https://pytorch.org/docs/stable/elastic/run.html), you have to ssh to each node and run the following command on both of them. The launcher waits until both nodes are synchronized before launching the training.
 
 ```bash
-python -m torch.run --nproc_per_node=8 --nnode=2 --node_rank=0 --master_addr=hostname1 \
+torchrun --nproc_per_node=8 --nnode=2 --node_rank=0 --master_addr=hostname1 \
 --master_port=9901 your_program.py --deepspeed ds_config.json
 ```
 