Fix (DeepSpeed) docker image build issue (#21002)

* Fix docker image build issue

* remove comment

* Add comment

* Update docker/transformers-pytorch-deepspeed-latest-gpu/Dockerfile

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
This commit is contained in:
Yih-Dar
2023-01-04 21:28:33 +01:00
committed by GitHub
parent b91048968b
commit 94db82573e
2 changed files with 2 additions and 4 deletions

View File

@@ -27,7 +27,8 @@ RUN python3 -m pip install torch-tensorrt==1.3.0 --find-links https://github.com
# recompile apex
RUN python3 -m pip uninstall -y apex
RUN git clone https://github.com/NVIDIA/apex
RUN cd apex && python3 -m pip install --global-option="--cpp_ext" --global-option="--cuda_ext" --no-cache -v --disable-pip-version-check .
# `MAX_JOBS=1` disables parallel building to avoid cpu memory OOM when building image on GitHub Action (standard) runners
RUN cd apex && MAX_JOBS=1 python3 -m pip install --global-option="--cpp_ext" --global-option="--cuda_ext" --no-cache -v --disable-pip-version-check .
# Pre-build **latest** DeepSpeed, so it would be ready for testing (otherwise, the 1st deepspeed test will timeout)
RUN python3 -m pip uninstall -y deepspeed