Fixed spelling of training (#4416)

Soham Chatterjee
2020-05-18 23:23:29 +08:00
committed by GitHub
parent 757baee846
commit fa6113f9a0

@@ -6,7 +6,7 @@ Overview
 The ALBERT model was proposed in `ALBERT: A Lite BERT for Self-supervised Learning of Language Representations <https://arxiv.org/abs/1909.11942>`_
 by Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut. It presents
-two parameter-reduction techniques to lower memory consumption and increase the trainig speed of BERT:
+two parameter-reduction techniques to lower memory consumption and increase the training speed of BERT:
 - Splitting the embedding matrix into two smaller matrices
 - Using repeating layers split among groups