[parallelism doc] document Deepspeed-Inference and parallelformers (#12836)

* document Deepspeed-Inference and parallelformers
* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-07-21 15:11:02 -07:00
parent 807b6bd160
commit 27a8c9e4f1
1 changed files with 5 additions and 1 deletions
--- a/docs/source/parallelism.md
+++ b/docs/source/parallelism.md
@@ -224,7 +224,11 @@ Implementations:
 - DeepSpeed calls it [tensor slicing](https://www.deepspeed.ai/features/#model-parallelism)
 - [Megatron-LM](https://github.com/NVIDIA/Megatron-LM) has an internal implementation.

-🤗 Transformers status: not yet implemented
+🤗 Transformers status:
+- core: not yet implemented
+- inference: [parallelformers](https://github.com/tunib-ai/parallelformers) provides this support for most of our models, so you can use it until this is implemented in the core. Hopefully training mode will be supported too.
+- DeepSpeed-Inference also supports our BERT, GPT-2, and GPT-Neo models in its fast CUDA-kernel-based inference mode; see the [inference tutorial](https://www.deepspeed.ai/tutorials/inference-tutorial/) for details.
+


 ## DP+PP