Fix typos (#31819)
* fix typo
* fix typo
* fix typos
* fix typo
* fix typos
@@ -139,7 +139,7 @@ reading the whole sentence with a mask to hide future tokens at a certain timest
 ### deep learning (DL)

-Machine learning algorithms which uses neural networks with several layers.
+Machine learning algorithms which use neural networks with several layers.

 ## E
@@ -519,4 +519,4 @@ A form of model training in which data provided to the model is not labeled. Uns
 Parallelism technique which performs sharding of the tensors somewhat similar to [TensorParallel](#tensor-parallelism-tp),
 except the whole tensor gets reconstructed in time for a forward or backward computation, therefore the model doesn't need
 to be modified. This method also supports various offloading techniques to compensate for limited GPU memory.
-Learn more about ZeRO [here](perf_train_gpu_many#zero-data-parallelism).
+Learn more about ZeRO [here](perf_train_gpu_many#zero-data-parallelism).