big doc update [WIP]
This commit is contained in:
@@ -1,7 +1,7 @@
|
||||
Converting Tensorflow Checkpoints
|
||||
================================================
|
||||
|
||||
A command-line interface is provided to convert a TensorFlow checkpoint in a PyTorch dump of the ``BertForPreTraining`` class (for BERT) or NumPy checkpoint in a PyTorch dump of the ``OpenAIGPTModel`` class (for OpenAI GPT).
|
||||
A command-line interface is provided to convert original Bert/GPT/GPT-2/Transformer-XL/XLNet/XLM checkpoints in models than be loaded using the ``from_pretrained`` methods of the library.
|
||||
|
||||
BERT
|
||||
^^^^
|
||||
@@ -41,6 +41,20 @@ Here is an example of the conversion process for a pre-trained OpenAI GPT model,
|
||||
$PYTORCH_DUMP_OUTPUT \
|
||||
[OPENAI_GPT_CONFIG]
|
||||
|
||||
OpenAI GPT-2
|
||||
^^^^^^^^^^^^
|
||||
|
||||
Here is an example of the conversion process for a pre-trained OpenAI GPT-2 model (see `here <https://github.com/openai/gpt-2>`__\ )
|
||||
|
||||
.. code-block:: shell
|
||||
|
||||
export OPENAI_GPT2_CHECKPOINT_PATH=/path/to/gpt2/pretrained/weights
|
||||
|
||||
pytorch_transformers gpt2 \
|
||||
$OPENAI_GPT2_CHECKPOINT_PATH \
|
||||
$PYTORCH_DUMP_OUTPUT \
|
||||
[OPENAI_GPT2_CONFIG]
|
||||
|
||||
Transformer-XL
|
||||
^^^^^^^^^^^^^^
|
||||
|
||||
@@ -55,19 +69,6 @@ Here is an example of the conversion process for a pre-trained Transformer-XL mo
|
||||
$PYTORCH_DUMP_OUTPUT \
|
||||
[TRANSFO_XL_CONFIG]
|
||||
|
||||
GPT-2
|
||||
^^^^^
|
||||
|
||||
Here is an example of the conversion process for a pre-trained OpenAI's GPT-2 model.
|
||||
|
||||
.. code-block:: shell
|
||||
|
||||
export GPT2_DIR=/path/to/gpt2/checkpoint
|
||||
|
||||
pytorch_transformers gpt2 \
|
||||
$GPT2_DIR/model.ckpt \
|
||||
$PYTORCH_DUMP_OUTPUT \
|
||||
[GPT2_CONFIG]
|
||||
|
||||
XLNet
|
||||
^^^^^
|
||||
@@ -84,3 +85,17 @@ Here is an example of the conversion process for a pre-trained XLNet model, fine
|
||||
$TRANSFO_XL_CONFIG_PATH \
|
||||
$PYTORCH_DUMP_OUTPUT \
|
||||
STS-B \
|
||||
|
||||
|
||||
XLM
|
||||
^^^
|
||||
|
||||
Here is an example of the conversion process for a pre-trained XLM model:
|
||||
|
||||
.. code-block:: shell
|
||||
|
||||
export XLM_CHECKPOINT_PATH=/path/to/xlm/checkpoint
|
||||
|
||||
pytorch_transformers xlm \
|
||||
$XLM_CHECKPOINT_PATH \
|
||||
$PYTORCH_DUMP_OUTPUT \
|
||||
|
||||
@@ -21,11 +21,20 @@ The library currently contains PyTorch implementations, pre-trained model weight
|
||||
pretrained_models
|
||||
examples
|
||||
notebooks
|
||||
serialization
|
||||
converting_tensorflow_models
|
||||
migration
|
||||
bertology
|
||||
torchscript
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
:caption: Main classes
|
||||
|
||||
main_classes/configuration
|
||||
main_classes/model
|
||||
main_classes/tokenizer
|
||||
main_classes/optimizer_schedules
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
@@ -1,12 +1,12 @@
|
||||
Installation
|
||||
================================================
|
||||
|
||||
This repo was tested on Python 2.7 and 3.5+ (examples are tested only on python 3.5+) and PyTorch 0.4.1/1.0.0
|
||||
PyTorch-Transformers is tested on Python 2.7 and 3.5+ (examples are tested only on python 3.5+) and PyTorch 1.1.0
|
||||
|
||||
With pip
|
||||
^^^^^^^^
|
||||
|
||||
PyTorch pretrained bert can be installed with pip as follows:
|
||||
PyTorch Transformers can be installed using pip as follows:
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
@@ -15,7 +15,7 @@ PyTorch pretrained bert can be installed with pip as follows:
|
||||
From source
|
||||
^^^^^^^^^^^
|
||||
|
||||
Clone the repository and instal locally:
|
||||
To install from source, clone the repository and install with:
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
@@ -27,11 +27,11 @@ Clone the repository and instal locally:
|
||||
Tests
|
||||
^^^^^
|
||||
|
||||
An extensive test suite is included for the library and the example scripts. Library tests can be found in the `tests folder <https://github.com/huggingface/pytorch-transformers/tree/master/pytorch_transformers/tests>`_ and examples tests in the `examples folder <https://github.com/huggingface/pytorch-transformers/tree/master/examples>`_.
|
||||
An extensive test suite is included to test the library behavior and several examples. Library tests can be found in the `tests folder <https://github.com/huggingface/pytorch-transformers/tree/master/pytorch_transformers/tests>`_ and examples tests in the `examples folder <https://github.com/huggingface/pytorch-transformers/tree/master/examples>`_.
|
||||
|
||||
These tests can be run using `pytest` (install pytest if needed with `pip install pytest`).
|
||||
Tests can be run using `pytest` (install pytest if needed with `pip install pytest`).
|
||||
|
||||
You can run the tests from the root of the cloned repository with the commands:
|
||||
Run all the tests from the root of the cloned repository with the commands:
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
@@ -42,11 +42,11 @@ You can run the tests from the root of the cloned repository with the commands:
|
||||
OpenAI GPT original tokenization workflow
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
If you want to reproduce the original tokenization process of the ``OpenAI GPT`` paper, you will need to install ``ftfy`` (limit to version 4.4.3 if you are using Python 2) and ``SpaCy`` :
|
||||
If you want to reproduce the original tokenization process of the ``OpenAI GPT`` paper, you will need to install ``ftfy`` (use version 4.4.3 if you are using Python 2) and ``SpaCy`` :
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
pip install spacy ftfy==4.4.3
|
||||
python -m spacy download en
|
||||
|
||||
If you don't install ``ftfy`` and ``SpaCy``\ , the ``OpenAI GPT`` tokenizer will default to tokenize using BERT's ``BasicTokenizer`` followed by Byte-Pair Encoding (which should be fine for most usage, don't worry).
|
||||
If you don't install ``ftfy`` and ``SpaCy``\ , the ``OpenAI GPT`` tokenizer defaults to tokenize using BERT's ``BasicTokenizer`` followed by Byte-Pair Encoding (which should be fine for most usage, don't worry).
|
||||
|
||||
10
docs/source/main_classes/configuration.rst
Normal file
10
docs/source/main_classes/configuration.rst
Normal file
@@ -0,0 +1,10 @@
|
||||
Configuration
|
||||
----------------------------------------------------
|
||||
|
||||
We provide a base class, ``PretrainedConfig``, which can load a pretrained instance either from a local file or directory or from a pretrained model configuration provided by the library (downloaded from HuggingFace AWS S3 repository).
|
||||
|
||||
``PretrainedConfig``
|
||||
~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: pytorch_transformers.PretrainedConfig
|
||||
:members:
|
||||
8
docs/source/main_classes/model.rst
Normal file
8
docs/source/main_classes/model.rst
Normal file
@@ -0,0 +1,8 @@
|
||||
Models
|
||||
----------------------------------------------------
|
||||
|
||||
``PreTrainedModel``
|
||||
~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: pytorch_transformers.PreTrainedModel
|
||||
:members:
|
||||
26
docs/source/main_classes/optimizer_schedules.rst
Normal file
26
docs/source/main_classes/optimizer_schedules.rst
Normal file
@@ -0,0 +1,26 @@
|
||||
Optimizer
|
||||
----------------------------------------------------
|
||||
|
||||
``AdamW``
|
||||
~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: pytorch_transformers.AdamW
|
||||
:members:
|
||||
|
||||
Schedules
|
||||
----------------------------------------------------
|
||||
|
||||
.. autoclass:: pytorch_transformers.ConstantLRSchedule
|
||||
:members:
|
||||
|
||||
.. autoclass:: pytorch_transformers.WarmupConstantSchedule
|
||||
:members:
|
||||
|
||||
.. autoclass:: pytorch_transformers.WarmupCosineSchedule
|
||||
:members:
|
||||
|
||||
.. autoclass:: pytorch_transformers.WarmupCosineWithHardRestartsSchedule
|
||||
:members:
|
||||
|
||||
.. autoclass:: pytorch_transformers.WarmupLinearSchedule
|
||||
:members:
|
||||
8
docs/source/main_classes/tokenizer.rst
Normal file
8
docs/source/main_classes/tokenizer.rst
Normal file
@@ -0,0 +1,8 @@
|
||||
Tokenizer
|
||||
----------------------------------------------------
|
||||
|
||||
``PreTrainedTokenizer``
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: pytorch_transformers.PreTrainedTokenizer
|
||||
:members:
|
||||
@@ -35,10 +35,13 @@ loss, logits, attentions = outputs
|
||||
|
||||
### Serialization
|
||||
|
||||
Breaking change: Models are now set in evaluation mode by default when instantiated with the `from_pretrained()` method.
|
||||
To train them don't forget to set them back in training mode (`model.train()`) to activate the dropout modules.
|
||||
Breaking change in the `from_pretrained()`method:
|
||||
|
||||
Also, while not a breaking change, the serialization methods have been standardized and you probably should switch to the new method `save_pretrained(save_directory)` if you were using any other seralization method before.
|
||||
1. Models are now set in evaluation mode by default when instantiated with the `from_pretrained()` method. To train them don't forget to set them back in training mode (`model.train()`) to activate the dropout modules.
|
||||
|
||||
2. The additional `*inputs` and `**kwargs` arguments supplied to the `from_pretrained()` method used to be directly passed to the underlying model's class `__init__()` method. They are now used to update the model configuration attribute first which can break derived model classes build based on the previous `BertForSequenceClassification` examples. More precisely, the positional arguments `*inputs` provided to `from_pretrained()` are directly forwarded the model `__init__()` method while the keyword arguments `**kwargs` (i) which match configuration class attributes are used to update said attributes (ii) which don't match any configuration class attributes are forwarded to the model `__init__()` method.
|
||||
|
||||
Also, while not a breaking change, the serialization methods have been standardized and you probably should switch to the new method `save_pretrained(save_directory)` if you were using any other serialization method before.
|
||||
|
||||
Here is an example:
|
||||
|
||||
|
||||
@@ -15,12 +15,6 @@ BERT
|
||||
:members:
|
||||
|
||||
|
||||
``AdamW``
|
||||
~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: pytorch_transformers.AdamW
|
||||
:members:
|
||||
|
||||
``BertModel``
|
||||
~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
|
||||
@@ -1,17 +1,58 @@
|
||||
# Quickstart
|
||||
|
||||
## Philosophy
|
||||
|
||||
PyTorch-Transformers is an opinionated library built for NLP researchers seeking to use/study/extend large-scale transformers models.
|
||||
|
||||
The library was designed with two strong goals in mind:
|
||||
|
||||
- be as easy and fast to use as possible:
|
||||
|
||||
- we strongly limited the number of abstractions to learn, in fact there are almost no abstractions, just three standard classes for each model: configuration, models and tokenizer,
|
||||
- each pretrained model configuration, weights and vocabulary can be downloaded, cached and loaded in the related class in a simple way by using a common `from_pretrained()` instantiation method.
|
||||
- this library is NOT a modular toolbox of building blocks for neural nets, to extend/build-upon the library, just use your regular Python/PyTorch modules and inherit from the base classes of the library to reuse functionalities like model loading/saving.
|
||||
|
||||
- provide state-of-the-art models with performances as close as possible to the original models:
|
||||
|
||||
- we provide at least one example for each model which reproduces a result provided by the official authors of said model,
|
||||
- the code is usually as close to the original code base as possible which means some PyTorch code may be not as *pytorchic* as it could be as a result of being converted TensorFlow code.
|
||||
|
||||
A few other goals:
|
||||
|
||||
- expose the models internals as consistently as possible:
|
||||
|
||||
- we give access, using a single API to the full hidden-states and attention weights,
|
||||
- tokenizer and base model's API are standardized to easily switch between models.
|
||||
|
||||
- incorporate a subjective selection of promising tools for fine-tuning/investiguating these models:
|
||||
|
||||
- a simple/consistent way to add new tokens to the vocabulary and embeddings for fine-tuning,
|
||||
- simple ways to mask and prune transformer heads.
|
||||
|
||||
## Main concepts
|
||||
|
||||
The library is build around three type of classes for each models:
|
||||
|
||||
- **model classes** which are PyTorch models (`torch.nn.Modules`) of the 6 models architectures currently provided in the library, e.g. `BertModel`
|
||||
- **configuration classes** which store all the parameters required to build a model, e.g. `BertConfig`
|
||||
- **tokenizer classes** which store the vocabulary for each model and provide methods for encoding strings in list of token embeddings indices to be fed to a model, e.g. `BertTokenizer`
|
||||
|
||||
All these classes can be instantiated from pretrained instances and saved locally using two methods:
|
||||
|
||||
- `from_pretrained()` let you instantiate a model/configuration/tokenizer from a pretrained version either provided by the library itself (currently 27 models are provided as listed [here](https://huggingface.co/pytorch-transformers/pretrained_models.html)) or stored locally (or on a server) by the user,
|
||||
- `save_pretrained()` let you save a model/configuration/tokenizer locally so that it can be reloaded using `from_pretrained()`.
|
||||
|
||||
Let's go through a few simple quick-start examples to see how we can instantiate and use these classes.
|
||||
|
||||
## Quick tour: Usage
|
||||
|
||||
Here are two quick-start examples showcasing a few `Bert` and `GPT2` classes and pre-trained models.
|
||||
Here are two examples showcasing a few `Bert` and `GPT2` classes and pre-trained models.
|
||||
|
||||
See package reference for examples for each model classe.
|
||||
See full API reference for examples for each model classe.
|
||||
|
||||
### BERT example
|
||||
|
||||
First let's prepare a tokenized input from a text string using `BertTokenizer`
|
||||
Let's start by preparing a tokenized input (a list of token embeddings indices to be fed to Bert) from a text string using `BertTokenizer`
|
||||
|
||||
```python
|
||||
import torch
|
||||
|
||||
@@ -1,3 +1,6 @@
|
||||
Serialization
|
||||
----------------------------------------------------
|
||||
|
||||
### Loading Google AI or OpenAI pre-trained weights or PyTorch dump
|
||||
|
||||
### `from_pretrained()` method
|
||||
|
||||
Reference in New Issue
Block a user