From 27a8c9e4f189462c1cc4206317a27032887c8513 Mon Sep 17 00:00:00 2001
From: Stas Bekman <stas00@users.noreply.github.com>
Date: Wed, 21 Jul 2021 15:11:02 -0700
Subject: [PATCH] [parallelism doc] document Deepspeed-Inference and
 parallelformers (#12836)

* document Deepspeed-Inference and parallelformers

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
---
 docs/source/parallelism.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/docs/source/parallelism.md b/docs/source/parallelism.md
index 084f381e27..1e5265bf5b 100644
--- a/docs/source/parallelism.md
+++ b/docs/source/parallelism.md
@@ -224,7 +224,11 @@ Implementations:
 - DeepSpeed calls it [tensor slicing](https://www.deepspeed.ai/features/#model-parallelism)
 - [Megatron-LM](https://github.com/NVIDIA/Megatron-LM) has an internal implementation.
 
-🤗 Transformers status: not yet implemented
+🤗 Transformers status:
+- core: not yet implemented
+- inference: [parallelformers](https://github.com/tunib-ai/parallelformers) provides this support for most of our models, so until it is implemented in the core you can use theirs. Hopefully training mode will be supported too.
+- DeepSpeed-Inference also supports our BERT, GPT-2, and GPT-Neo models in its super-fast CUDA-kernel-based inference mode; see the [DeepSpeed inference tutorial](https://www.deepspeed.ai/tutorials/inference-tutorial/) for details.
+
 
 
 ## DP+PP