Update installation page and add contributing to the doc (#5084)
* Update installation page and add contributing to the doc * Remove mention of symlinks
This commit is contained in:
1
docs/source/contributing.md
Symbolic link
1
docs/source/contributing.md
Symbolic link
@@ -0,0 +1 @@
|
||||
../../CONTRIBUTING.md
|
||||
@@ -142,6 +142,7 @@ conversion utilities for the following models:
|
||||
converting_tensorflow_models
|
||||
migration
|
||||
torchscript
|
||||
contributing
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
@@ -1,69 +1,102 @@
|
||||
# Installation
|
||||
|
||||
Transformers is tested on Python 3.6+ and PyTorch 1.1.0
|
||||
🤗 Transformers is tested on Python 3.6+, and PyTorch 1.1.0+ or TensorFlow 2.0+.
|
||||
|
||||
## With pip
|
||||
You should install 🤗 Transformers in a [virtual environment](https://docs.python.org/3/library/venv.html). If you're
|
||||
unfamiliar with Python virtual environments, check out the [user guide](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/). Create a virtual environment with the version of Python you're going
|
||||
to use and activate it.
|
||||
|
||||
PyTorch Transformers can be installed using pip as follows:
|
||||
Now, if you want to use 🤗 Transformers, you can install it with pip. If you'd like to play with the examples, you
|
||||
must install it from source.
|
||||
|
||||
``` bash
|
||||
## Installation with pip
|
||||
|
||||
First you need to install one of, or both, TensorFlow 2.0 and PyTorch.
|
||||
Please refer to [TensorFlow installation page](https://www.tensorflow.org/install/pip#tensorflow-2.0-rc-is-available)
|
||||
and/or [PyTorch installation page](https://pytorch.org/get-started/locally/#start-locally) regarding the specific
|
||||
install command for your platform.
|
||||
|
||||
When TensorFlow 2.0 and/or PyTorch has been installed, 🤗 Transformers can be installed using pip as follows:
|
||||
|
||||
```bash
|
||||
pip install transformers
|
||||
```
|
||||
|
||||
## From source
|
||||
Alternatively, for CPU-support only, you can install 🤗 Transformers and PyTorch in one line with
|
||||
|
||||
To install from source, clone the repository and install with:
|
||||
```bash
|
||||
pip install transformers[torch]
|
||||
```
|
||||
|
||||
or 🤗 Transformers and TensorFlow 2.0 in one line with
|
||||
|
||||
```bash
|
||||
pip install transformers[tf-cpu]
|
||||
```
|
||||
|
||||
To check 🤗 Transformers is properly installed, run the following command:
|
||||
|
||||
```bash
|
||||
python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('I hate you'))"
|
||||
```
|
||||
|
||||
It should download a pretrained model then print something like
|
||||
|
||||
```bash
|
||||
[{'label': 'NEGATIVE', 'score': 0.9991129040718079}]
|
||||
```
|
||||
|
||||
(Note that TensorFlow will print additional stuff before that last statement.)
|
||||
|
||||
## Installing from source
|
||||
|
||||
To install from source, clone the repository and install with the following commands:
|
||||
|
||||
``` bash
|
||||
git clone https://github.com/huggingface/transformers.git
|
||||
cd transformers
|
||||
pip install .
|
||||
pip install -e .
|
||||
```
|
||||
|
||||
Again, you can run
|
||||
|
||||
```bash
|
||||
python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('I hate you'))"
|
||||
```
|
||||
|
||||
to check 🤗 Transformers is properly installed.
|
||||
|
||||
## Caching models
|
||||
|
||||
This library provides pretrained models that will be downloaded and cached locally. Unless you specify a location with
|
||||
`cache_dir=...` when you use the `from_pretrained` method, these models will automatically be downloaded in the
|
||||
folder given by the shell environment variable ``TRANSFORMERS_CACHE``. The default value for it will be the PyTorch
|
||||
`cache_dir=...` when you use methods like `from_pretrained`, these models will automatically be downloaded in the
|
||||
folder given by the shell environment variable ``TRANSFORMERS_CACHE``. The default value for it will be the PyTorch
|
||||
cache home followed by ``/transformers/`` (even if you don't have PyTorch installed). This is (by order of priority):
|
||||
|
||||
* shell environment variable ``ENV_TORCH_HOME``
|
||||
* shell environment variable ``ENV_XDG_CACHE_HOME`` + ``/torch/``
|
||||
* default: ``~/.cache/torch/``
|
||||
|
||||
So if you don't have any specific environment variable set, the cache directory will be at
|
||||
So if you don't have any specific environment variable set, the cache directory will be at
|
||||
``~/.cache/torch/transformers/``.
|
||||
|
||||
**Note:** If you have set a shell enviromnent variable for one of the predecessors of this library
|
||||
(``PYTORCH_TRANSFORMERS_CACHE`` or ``PYTORCH_PRETRAINED_BERT_CACHE``), those will be used if there is no shell
|
||||
**Note:** If you have set a shell enviromnent variable for one of the predecessors of this library
|
||||
(``PYTORCH_TRANSFORMERS_CACHE`` or ``PYTORCH_PRETRAINED_BERT_CACHE``), those will be used if there is no shell
|
||||
enviromnent variable for ``TRANSFORMERS_CACHE``.
|
||||
|
||||
## Tests
|
||||
### Note on model downloads (Continuous Integration or large-scale deployments)
|
||||
|
||||
An extensive test suite is included to test the library behavior and several examples. Library tests can be found in the [tests folder](https://github.com/huggingface/transformers/tree/master/tests) and examples tests in the [examples folder](https://github.com/huggingface/transformers/tree/master/examples).
|
||||
|
||||
Refer to the [contributing guide](https://github.com/huggingface/transformers/blob/master/CONTRIBUTING.md#tests) for details about running tests.
|
||||
|
||||
## OpenAI GPT original tokenization workflow
|
||||
|
||||
If you want to reproduce the original tokenization process of the `OpenAI GPT` paper, you will need to install `ftfy` and `SpaCy`:
|
||||
|
||||
``` bash
|
||||
pip install spacy ftfy==4.4.3
|
||||
python -m spacy download en
|
||||
```
|
||||
|
||||
If you don't install `ftfy` and `SpaCy`, the `OpenAI GPT` tokenizer will default to tokenize using BERT's `BasicTokenizer` followed by Byte-Pair Encoding (which should be fine for most usage, don't worry).
|
||||
|
||||
## Note on model downloads (Continuous Integration or large-scale deployments)
|
||||
|
||||
If you expect to be downloading large volumes of models (more than 1,000) from our hosted bucket (for instance through your CI setup, or a large-scale production deployment), please cache the model files on your end. It will be way faster, and cheaper. Feel free to contact us privately if you need any help.
|
||||
If you expect to be downloading large volumes of models (more than 1,000) from our hosted bucket (for instance through
|
||||
your CI setup, or a large-scale production deployment), please cache the model files on your end. It will be way
|
||||
faster, and cheaper. Feel free to contact us privately if you need any help.
|
||||
|
||||
## Do you want to run a Transformer model on a mobile device?
|
||||
|
||||
You should check out our [swift-coreml-transformers](https://github.com/huggingface/swift-coreml-transformers) repo.
|
||||
|
||||
It contains a set of tools to convert PyTorch or TensorFlow 2.0 trained Transformer models (currently contains `GPT-2`, `DistilGPT-2`, `BERT`, and `DistilBERT`) to CoreML models that run on iOS devices.
|
||||
It contains a set of tools to convert PyTorch or TensorFlow 2.0 trained Transformer models (currently contains `GPT-2`,
|
||||
`DistilGPT-2`, `BERT`, and `DistilBERT`) to CoreML models that run on iOS devices.
|
||||
|
||||
At some point in the future, you'll be able to seamlessly move from pre-training or fine-tuning models in PyTorch to productizing them in CoreML,
|
||||
or prototype a model or an app in CoreML then research its hyperparameters or architecture from PyTorch. Super exciting!
|
||||
At some point in the future, you'll be able to seamlessly move from pre-training or fine-tuning models in PyTorch or
|
||||
TensorFlow 2.0 to productizing them in CoreML, or prototype a model or an app in CoreML then research its
|
||||
hyperparameters or architecture from PyTorch or TensorFlow 2.0. Super exciting!
|
||||
|
||||
@@ -38,6 +38,17 @@ Hugging Face showcasing the generative capabilities of several models. GPT is on
|
||||
|
||||
The original code can be found `here <https://github.com/openai/finetune-transformer-lm>`_.
|
||||
|
||||
Note:
|
||||
|
||||
If you want to reproduce the original tokenization process of the `OpenAI GPT` paper, you will need to install
|
||||
``ftfy`` and ``SpaCy``::
|
||||
|
||||
pip install spacy ftfy==4.4.3
|
||||
python -m spacy download en
|
||||
|
||||
If you don't install ``ftfy`` and ``SpaCy``, the :class:`transformers.OpenAIGPTTokenizer` will default to tokenize using
|
||||
BERT's :obj:`BasicTokenizer` followed by Byte-Pair Encoding (which should be fine for most usage, don't
|
||||
worry).
|
||||
|
||||
OpenAIGPTConfig
|
||||
~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Reference in New Issue
Block a user