From 27a8c9e4f189462c1cc4206317a27032887c8513 Mon Sep 17 00:00:00 2001
From: Stas Bekman
Date: Wed, 21 Jul 2021 15:11:02 -0700
Subject: [PATCH] [parallelism doc] document Deepspeed-Inference and parallelformers (#12836)

* document Deepspeed-Inference and parallelformers

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
---
 docs/source/parallelism.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/docs/source/parallelism.md b/docs/source/parallelism.md
index 084f381e27..1e5265bf5b 100644
--- a/docs/source/parallelism.md
+++ b/docs/source/parallelism.md
@@ -224,7 +224,11 @@ Implementations:
 - DeepSpeed calls it [tensor slicing](https://www.deepspeed.ai/features/#model-parallelism)
 - [Megatron-LM](https://github.com/NVIDIA/Megatron-LM) has an internal implementation.
 
-🤗 Transformers status: not yet implemented
+🤗 Transformers status:
+- core: not yet implemented in the core
+- but if you want inference [parallelformers](https://github.com/tunib-ai/parallelformers) provides this support for most of our models. So until this is implemented in the core you can use theirs. And hopefully training mode will be supported too.
+- Deepspeed-Inference also supports our BERT, GPT-2, and GPT-Neo models in their super-fast CUDA-kernel-based inference mode, see more [here](https://www.deepspeed.ai/tutorials/inference-tutorial/)
+
 ## DP+PP