Use HF papers (#38184)
* Use hf papers
* Hugging Face papers
* doi to hf papers
* style
committed by GitHub
parent 1031ed5166
commit de24fb63ed
@@ -8,7 +8,7 @@ This example will demonstrate pre-training language models at the 100M-1B parame
 
 We've tried to ensure that all the practices we show you here are scalable, though - with relatively few changes, the code could be scaled up to much larger models.
 
-Google's gargantuan [PaLM model](https://arxiv.org/abs/2204.02311), with
+Google's gargantuan [PaLM model](https://huggingface.co/papers/2204.02311), with
 over 500B parameters, is a good example of how far you can go with pure TPU training, though gathering the dataset and the budget to train at that scale is not an easy task!
 
 ### Table of contents