diff --git a/docs/source/index.rst b/docs/source/index.rst index 8c76b89185..3b4fe4d1e8 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -46,8 +46,7 @@ The library currently contains PyTorch and Tensorflow implementations, pre-train 5. `XLNet `_ (from Google/CMU) released with the paper `​XLNet: Generalized Autoregressive Pretraining for Language Understanding `_ by Zhilin Yang*, Zihang Dai*, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, Quoc V. Le. 6. `XLM `_ (from Facebook) released together with the paper `Cross-lingual Language Model Pretraining `_ by Guillaume Lample and Alexis Conneau. 7. `RoBERTa `_ (from Facebook), released together with the paper a `Robustly Optimized BERT Pretraining Approach `_ by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov. -8. `DistilBERT `_ (from HuggingFace) released together with the blog post `Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT `_ by Victor Sanh, Lysandre Debut and Thomas Wolf. - +8. `DistilBERT `_ (from HuggingFace) released together with the paper `DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter `_ by Victor Sanh, Lysandre Debut and Thomas Wolf. The same method has been applied to compress GPT2 into `DistilGPT2 `_. .. toctree:: :maxdepth: 2