From 44c7857b873e535d8a200000b1da2ec23cf74273 Mon Sep 17 00:00:00 2001
From: Stas Bekman
Date: Mon, 31 Jan 2022 08:28:10 -0800
Subject: [PATCH] [deepspeed doc] fix import, extra notes (#15400)

* [deepspeed doc] fix import, extra notes

* typo
---
 docs/source/main_classes/deepspeed.mdx | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/docs/source/main_classes/deepspeed.mdx b/docs/source/main_classes/deepspeed.mdx
index 78381264a0..074eb2f777 100644
--- a/docs/source/main_classes/deepspeed.mdx
+++ b/docs/source/main_classes/deepspeed.mdx
@@ -1708,7 +1708,7 @@ Work is being done to enable estimating how much memory is needed for a specific
 ## Non-Trainer Deepspeed Integration
 
 The [`~deepspeed.HfDeepSpeedConfig`] is used to integrate Deepspeed into the 🤗 Transformers core
-functionality, when [`Trainer`] is not used.
+functionality, when [`Trainer`] is not used. The only thing that it does is handling Deepspeed ZeRO 3 param gathering and automatically splitting the model onto multiple gpus during `from_pretrained` call. Everything else you have to do by yourself.
 
 When using [`Trainer`] everything is automatically taken care of.
 
@@ -1719,10 +1719,11 @@ For example for a pretrained model:
 
 ```python
 from transformers.deepspeed import HfDeepSpeedConfig
-from transformers import AutoModel, deepspeed
+from transformers import AutoModel
+import deepspeed
 
 ds_config = {...}  # deepspeed config object or path to the file
-# must run before instantiating the model
+# must run before instantiating the model to detect zero 3
 dschf = HfDeepSpeedConfig(ds_config)  # keep this object alive
 model = AutoModel.from_pretrained("gpt2")
 engine = deepspeed.initialize(model=model, config_params=ds_config, ...)
@@ -1732,16 +1733,19 @@ or for non-pretrained model:
 
 ```python
 from transformers.deepspeed import HfDeepSpeedConfig
-from transformers import AutoModel, AutoConfig, deepspeed
+from transformers import AutoModel, AutoConfig
+import deepspeed
 
 ds_config = {...}  # deepspeed config object or path to the file
-# must run before instantiating the model
+# must run before instantiating the model to detect zero 3
 dschf = HfDeepSpeedConfig(ds_config)  # keep this object alive
 config = AutoConfig.from_pretrained("gpt2")
 model = AutoModel.from_config(config)
 engine = deepspeed.initialize(model=model, config_params=ds_config, ...)
 ```
 
+Please note that if you're not using the [`Trainer`] integration, you're completely on your own. Basically follow the documentation on the [Deepspeed](https://www.deepspeed.ai/) website. Also you have to configure explicitly the config file - you can't use `"auto"` values and you will have to put real values instead.
+
 ## HfDeepSpeedConfig
 
 [[autodoc]] deepspeed.HfDeepSpeedConfig
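
Note on the `"auto"` restriction added in the last hunk: outside of [`Trainer`], nothing replaces `"auto"` placeholders, so every such value has to be filled in by hand before the config reaches `deepspeed.initialize`. A minimal sketch of a pre-flight check follows. `find_auto_values` is a hypothetical helper written for this note, not part of 🤗 Transformers or DeepSpeed, and the sample config is illustrative only (the key names are taken from the DeepSpeed config layout, the values are placeholders).

```python
from typing import Any, List


def find_auto_values(config: Any, prefix: str = "") -> List[str]:
    """Return the dotted key paths whose value is the string "auto".

    Hypothetical helper for illustration; it only walks plain dicts and lists.
    """
    paths: List[str] = []
    if isinstance(config, dict):
        for key, value in config.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            paths.extend(find_auto_values(value, path))
    elif isinstance(config, list):
        for index, value in enumerate(config):
            paths.extend(find_auto_values(value, f"{prefix}[{index}]"))
    elif config == "auto":
        paths.append(prefix)
    return paths


# Illustrative config: two entries still carry the "auto" placeholder.
ds_config = {
    "train_micro_batch_size_per_gpu": "auto",
    "zero_optimization": {"stage": 3, "reduce_bucket_size": "auto"},
    "fp16": {"enabled": True},
}

leftover = find_auto_values(ds_config)
if leftover:
    # In a real script this would likely raise instead of printing.
    print("Replace these 'auto' values with real ones:", leftover)
```

Running a check like this before constructing `HfDeepSpeedConfig` surfaces leftover placeholders early, rather than leaving them to fail somewhere inside DeepSpeed.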