From 066fd047cc644e3418955b8efd956093ad9f6267 Mon Sep 17 00:00:00 2001
From: Stas Bekman
Date: Tue, 31 Aug 2021 03:47:23 -0700
Subject: [PATCH] correct TP implementation resources (#13248)

fix a few implementation links
---
 docs/source/parallelism.md | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/docs/source/parallelism.md b/docs/source/parallelism.md
index 1e5265bf5b..28b0822e2a 100644
--- a/docs/source/parallelism.md
+++ b/docs/source/parallelism.md
@@ -220,9 +220,12 @@ Special considerations: TP requires very fast network, and therefore it's not ad
 This section is based on the original much more [detailed TP overview](https://github.com/huggingface/transformers/issues/10321#issuecomment-783543530).
 by [@anton-l](https://github.com/anton-l).
 
-Implementations:
+Alternative names:
 - DeepSpeed calls it [tensor slicing](https://www.deepspeed.ai/features/#model-parallelism)
-- [Megatron-LM](https://github.com/NVIDIA/Megatron-LM) has an internal implementation.
+
+Implementations:
+- [Megatron-LM](https://github.com/NVIDIA/Megatron-LM) has an internal implementation, as it's very model-specific
+- [parallelformers](https://github.com/tunib-ai/parallelformers) (only inference at the moment)
 
 🤗 Transformers status:
 - core: not yet implemented in the core