diff --git a/docs/source/index.rst b/docs/source/index.rst
index 12c670ed06..bc94c6d5f7 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -1,17 +1,18 @@
 Transformers
 ================================================================================================================================================
 
-🤗 Transformers (formerly known as `pytorch-transformers` and `pytorch-pretrained-bert`) provides general-purpose architectures
-(BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet...) for Natural Language Understanding (NLU) and Natural Language Generation
-(NLG) with over 32+ pretrained models in 100+ languages and deep interoperability between TensorFlow 2.0 and PyTorch.
+State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0.
 
-This is the documentation of our repository `transformers `__.
+🤗 Transformers (formerly known as `pytorch-transformers` and `pytorch-pretrained-bert`) provides general-purpose
+architectures (BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet...) for Natural Language Understanding (NLU) and Natural
+Language Generation (NLG) with over 32+ pretrained models in 100+ languages and deep interoperability between
+TensorFlow 2.0 and PyTorch.
+
+This is the documentation of our repository `transformers `_.
 
 Features
 ---------------------------------------------------
 
-- As easy to use as pytorch-transformers
-- As powerful and concise as Keras
 - High performance on NLU and NLG tasks
 - Low barrier to entry for educators and practitioners
 
@@ -37,46 +38,121 @@ Choose the right framework for every part of a model's lifetime:
 Contents
 ---------------------------------
 
-The library currently contains PyTorch and Tensorflow implementations, pre-trained model weights, usage scripts and conversion utilities for the following models:
+The library currently contains PyTorch and Tensorflow implementations, pre-trained model weights, usage scripts and
+conversion utilities for the following models:
 
-1. `BERT `_ (from Google) released with the paper `BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding `_ by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova.
-2. `GPT `_ (from OpenAI) released with the paper `Improving Language Understanding by Generative Pre-Training `_ by Alec Radford, Karthik Narasimhan, Tim Salimans and Ilya Sutskever.
-3. `GPT-2 `_ (from OpenAI) released with the paper `Language Models are Unsupervised Multitask Learners `_ by Alec Radford*, Jeffrey Wu*, Rewon Child, David Luan, Dario Amodei** and Ilya Sutskever**.
-4. `Transformer-XL `_ (from Google/CMU) released with the paper `Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context `_ by Zihang Dai*, Zhilin Yang*, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov.
-5. `XLNet `_ (from Google/CMU) released with the paper `XLNet: Generalized Autoregressive Pretraining for Language Understanding `_ by Zhilin Yang*, Zihang Dai*, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, Quoc V. Le.
-6. `XLM `_ (from Facebook) released together with the paper `Cross-lingual Language Model Pretraining `_ by Guillaume Lample and Alexis Conneau.
-7. `RoBERTa `_ (from Facebook), released together with the paper a `Robustly Optimized BERT Pretraining Approach `_ by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov.
-8. `DistilBERT `_ (from HuggingFace) released together with the paper `DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter `_ by Victor Sanh, Lysandre Debut and Thomas Wolf. The same method has been applied to compress GPT2 into `DistilGPT2 `_.
-9. `CTRL `_ (from Salesforce), released together with the paper `CTRL: A Conditional Transformer Language Model for Controllable Generation `_ by Nitish Shirish Keskar*, Bryan McCann*, Lav R. Varshney, Caiming Xiong and Richard Socher.
-10. `CamemBERT `_ (from FAIR, Inria, Sorbonne Université) released together with the paper `CamemBERT: a Tasty French Language Model `_ by Louis Martin, Benjamin Muller, Pedro Javier Ortiz Suarez, Yoann Dupont, Laurent Romary, Eric Villemonte de la Clergerie, Djame Seddah, and Benoît Sagot.
-11. `ALBERT `_ (from Google Research), released together with the paper a `ALBERT: A Lite BERT for Self-supervised Learning of Language Representations `_ by Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut.
-12. `XLM-RoBERTa `_ (from Facebook AI), released together with the paper `Unsupervised Cross-lingual Representation Learning at Scale `_ by Alexis Conneau*, Kartikay Khandelwal*, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer and Veselin Stoyanov.
-13. `FlauBERT `_ (from CNRS) released with the paper `FlauBERT: Unsupervised Language Model Pre-training for French `_ by Hang Le, Loïc Vial, Jibril Frej, Vincent Segonne, Maximin Coavoux, Benjamin Lecouteux, Alexandre Allauzen, Benoît Crabbé, Laurent Besacier, Didier Schwab.
+1. `BERT `_ (from Google) released with the paper `BERT: Pre-training of Deep
+   Bidirectional Transformers for Language Understanding `_ by Jacob Devlin, Ming-Wei
+   Chang, Kenton Lee, and Kristina Toutanova.
+2. `GPT `_ (from OpenAI) released with the paper `Improving Language
+   Understanding by Generative Pre-Training `_ by Alec Radford, Karthik
+   Narasimhan, Tim Salimans, and Ilya Sutskever.
+3. `GPT-2 `_ (from OpenAI) released with the paper `Language Models are
+   Unsupervised Multitask Learners `_ by Alec Radford, Jeffrey Wu,
+   Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever.
+4. `Transformer-XL `_ (from Google/CMU) released with the paper
+   `Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context `_ by
+   Zihang Dai, Zhilin Yang, Yiming Yang, Jaime Carbonell, Quoc V. Le, and Ruslan Salakhutdinov.
+5. `XLNet `_ (from Google/CMU) released with the paper `XLNet: Generalized
+   Autoregressive Pretraining for Language Understanding `_ by Zhilin Yang, Zihang
+   Dai, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, and Quoc V. Le.
+6. `XLM `_ (from Facebook) released together with the paper `Cross-lingual
+   Language Model Pretraining `_ by Guillaume Lample and Alexis Conneau.
+7. `RoBERTa `_ (from Facebook), released together with
+   the paper a `Robustly Optimized BERT Pretraining Approach `_ by Yinhan Liu, Myle
+   Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin
+   Stoyanov.
+8. `DistilBERT `_ (from HuggingFace) released together
+   with the paper `DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
+   `_ by Victor Sanh, Lysandre Debut, and Thomas Wolf. The same method has been
+   applied to compress GPT2 into
+   `DistilGPT2 `_.
+9. `CTRL `_ (from Salesforce), released together with the
+   paper `CTRL: A Conditional Transformer Language Model for Controllable Generation
+   `_ by Nitish Shirish Keskar, Bryan McCann, Lav R. Varshney, Caiming Xiong,
+   and Richard Socher.
+10. `CamemBERT `_ (from FAIR, Inria, Sorbonne Université)
+    released together with the paper `CamemBERT: a Tasty French Language Model `_ by
+    Louis Martin, Benjamin Muller, Pedro Javier Ortiz Suarez, Yoann Dupont, Laurent Romary, Eric Villemonte de la
+    Clergerie, Djame Seddah, and Benoît Sagot.
+11. `ALBERT `_ (from Google Research), released together with the paper
+    `ALBERT: A Lite BERT for Self-supervised Learning of Language Representations `_
+    by Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, and Radu Soricut.
+12. `T5 `_ (from Google) released with the paper
+    `Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
+    `_ by Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang,
+    Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu.
+13. `XLM-RoBERTa `_ (from Facebook AI), released together
+    with the paper `Unsupervised Cross-lingual Representation Learning at Scale `_ by
+    Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard
+    Grave, Myle Ott, Luke Zettlemoyer, and Veselin Stoyanov.
+14. `MMBT `_ (from Facebook), released together with the paper a `Supervised
+    Multimodal Bitransformers for Classifying Images and Text `_ by Douwe Kiela,
+    Suvrat Bhooshan, Hamed Firooz, and Davide Testuggine.
+15. `FlauBERT `_ (from CNRS) released with the paper `FlauBERT: Unsupervised
+    Language Model Pre-training for French `_ by Hang Le, Loïc Vial, Jibril Frej,
+    Vincent Segonne, Maximin Coavoux, Benjamin Lecouteux, Alexandre Allauzen, Benoît Crabbé, Laurent Besacier, and
+    Didier Schwab.
+16. `BART `_ (from Facebook) released with the paper
+    `BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
+    `_ by Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman
+    Mohamed, Omer Levy, Ves Stoyanov, and Luke Zettlemoyer.
+17. `ELECTRA `_ (from Google Research/Stanford University) released with
+    the paper `ELECTRA: Pre-training text encoders as discriminators rather than generators
+    `_ by Kevin Clark, Minh-Thang Luong, Quoc V. Le, and Christopher D. Manning.
+18. `DialoGPT `_ (from Microsoft Research) released with the paper `DialoGPT:
+    Large-Scale Generative Pre-training for Conversational Response Generation `_ by
+    Yizhe Zhang, Siqi Sun, Michel Galley, Yen-Chun Chen, Chris Brockett, Xiang Gao, Jianfeng Gao, Jingjing Liu,
+    and Bill Dolan.
+19. `Reformer `_ (from Google Research) released with
+    the paper `Reformer: The Efficient Transformer `_ by Nikita Kitaev, Łukasz
+    Kaiser, and Anselm Levskaya.
+20. `MarianMT `_ (developed by the Microsoft Translator Team) machine translation models
+    trained using `OPUS `_ pretrained_models data by Jörg Tiedemann.
+21. `Longformer `_ (from AllenAI) released with the paper `Longformer: The
+    Long-Document Transformer `_ by Iz Beltagy, Matthew E. Peters, and Arman Cohan.
+22. `Other community models `_, contributed by the `community
+    `_.
 
 .. toctree::
     :maxdepth: 2
-    :caption: Notes
+    :caption: Get started
 
     installation
     quickstart
     glossary
-    summary
-    pretrained_models
+
+.. toctree::
+    :maxdepth: 2
+    :caption: Using Transformers
+
     usage
+    summary
+    serialization
     model_sharing
+    multilingual
+
+.. toctree::
+    :maxdepth: 2
+    :caption: Advanced guides
+
+    pretrained_models
     examples
     notebooks
-    serialization
     converting_tensorflow_models
     migration
-    bertology
     torchscript
-    multilingual
+
+.. toctree::
+    :maxdepth: 2
+    :caption: Research
+
+    bertology
     benchmarks
 
 .. toctree::
     :maxdepth: 2
-    :caption: Main classes
+    :caption: Package Reference
 
     main_classes/configuration
     main_classes/model
@@ -84,11 +160,6 @@ The library currently contains PyTorch and Tensorflow implementations, pre-train
     main_classes/pipelines
     main_classes/optimizer_schedules
     main_classes/processors
-
-.. toctree::
-    :maxdepth: 2
-    :caption: Package Reference
-
     model_doc/auto
     model_doc/encoderdecoder
     model_doc/bert