[parallelism doc] document Deepspeed-Inference and parallelformers (#12836)
* document Deepspeed-Inference and parallelformers
* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
@@ -224,7 +224,11 @@ Implementations:
 - DeepSpeed calls it [tensor slicing](https://www.deepspeed.ai/features/#model-parallelism)
 - [Megatron-LM](https://github.com/NVIDIA/Megatron-LM) has an internal implementation.
 
-🤗 Transformers status: not yet implemented
+🤗 Transformers status:
+- core: not yet implemented
+- inference: [parallelformers](https://github.com/tunib-ai/parallelformers) provides this support for most of our models, so until it is implemented in the core you can use theirs. Hopefully training mode will be supported too.
+- DeepSpeed-Inference also supports our BERT, GPT-2, and GPT-Neo models in its fast CUDA-kernel-based inference mode; see the [inference tutorial](https://www.deepspeed.ai/tutorials/inference-tutorial/).
+
 
 
 ## DP+PP
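For readers unfamiliar with the tensor slicing that the hunk above refers to, here is a minimal sketch of the idea, assuming a single linear layer whose weight matrix is split column-wise across two simulated devices. It is not the DeepSpeed, Megatron-LM, or parallelformers implementation; the helper names `matmul`, `split_columns`, and `column_parallel_linear` and the toy shapes are hypothetical and chosen only for illustration.

```python
# Illustrative only: simulates tensor slicing on the CPU with plain Python lists.
# All helpers below are hypothetical and not part of any library named in this commit.

def matmul(x, w):
    """Multiply a (rows x k) matrix by a (k x cols) matrix."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


def split_columns(w, num_shards):
    """Split a weight matrix into `num_shards` equal column blocks."""
    cols = len(w[0])
    if cols % num_shards != 0:
        raise ValueError("column count must be divisible by num_shards")
    step = cols // num_shards
    return [[row[i * step:(i + 1) * step] for row in w] for i in range(num_shards)]


def column_parallel_linear(x, w, num_shards):
    """Compute x @ w by giving each simulated device one column block of w."""
    partials = [matmul(x, shard) for shard in split_columns(w, num_shards)]
    # Concatenate per-device outputs along the column axis
    # (on real hardware this step would be a gather across devices).
    return [sum((p[r] for p in partials), []) for r in range(len(x))]


x = [[1, 2], [3, 4]]
w = [[1, 0, 2, 1], [0, 1, 1, 3]]
# The sharded computation matches the unsharded one.
assert column_parallel_linear(x, w, num_shards=2) == matmul(x, w)
```

Each simulated device holds only half of the weight columns, which is the memory saving that motivates the technique; the libraries listed in the hunk add the device placement and communication that this sketch leaves out.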