Use HF papers (#38184)
* Use hf papers * Hugging Face papers * doi to hf papers * style
This commit is contained in:
committed by
GitHub
parent
1031ed5166
commit
de24fb63ed
@@ -16,7 +16,7 @@ rendered properly in your Markdown viewer.
|
||||
|
||||
# DeepSpeed集成
|
||||
|
||||
[DeepSpeed](https://github.com/deepspeedai/DeepSpeed)实现了[ZeRO论文](https://arxiv.org/abs/1910.02054)中描述的所有内容。目前,它提供对以下功能的全面支持:
|
||||
[DeepSpeed](https://github.com/deepspeedai/DeepSpeed)实现了[ZeRO论文](https://huggingface.co/papers/1910.02054)中描述的所有内容。目前,它提供对以下功能的全面支持:
|
||||
|
||||
1. 优化器状态分区(ZeRO stage 1)
|
||||
2. 梯度分区(ZeRO stage 2)
|
||||
@@ -25,7 +25,7 @@ rendered properly in your Markdown viewer.
|
||||
5. 一系列基于CUDA扩展的快速优化器
|
||||
6. ZeRO-Offload 到 CPU 和 NVMe
|
||||
|
||||
ZeRO-Offload有其自己的专门论文:[ZeRO-Offload: Democratizing Billion-Scale Model Training](https://arxiv.org/abs/2101.06840)。而NVMe支持在论文[ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Scale Deep Learning](https://arxiv.org/abs/2104.07857)中进行了描述。
|
||||
ZeRO-Offload有其自己的专门论文:[ZeRO-Offload: Democratizing Billion-Scale Model Training](https://huggingface.co/papers/2101.06840)。而NVMe支持在论文[ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Scale Deep Learning](https://huggingface.co/papers/2104.07857)中进行了描述。
|
||||
|
||||
DeepSpeed ZeRO-2主要用于训练,因为它的特性对推理没有用处。
|
||||
|
||||
@@ -2093,8 +2093,8 @@ RUN_SLOW=1 pytest tests/deepspeed
|
||||
|
||||
论文:
|
||||
|
||||
- [ZeRO: Memory Optimizations Toward Training Trillion Parameter Models](https://arxiv.org/abs/1910.02054)
|
||||
- [ZeRO-Offload: Democratizing Billion-Scale Model Training](https://arxiv.org/abs/2101.06840)
|
||||
- [ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Scale Deep Learning](https://arxiv.org/abs/2104.07857)
|
||||
- [ZeRO: Memory Optimizations Toward Training Trillion Parameter Models](https://huggingface.co/papers/1910.02054)
|
||||
- [ZeRO-Offload: Democratizing Billion-Scale Model Training](https://huggingface.co/papers/2101.06840)
|
||||
- [ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Scale Deep Learning](https://huggingface.co/papers/2104.07857)
|
||||
|
||||
最后,请记住,HuggingFace [`Trainer`]仅集成了DeepSpeed,因此如果您在使用DeepSpeed时遇到任何问题或疑问,请在[DeepSpeed GitHub](https://github.com/deepspeedai/DeepSpeed/issues)上提交一个issue。
|
||||
|
||||
Reference in New Issue
Block a user