Examples reorg (#11350)

* Base move

* Examples reorganization

* Update references

* Put back test data

* Move conftest

* More fixes

* Move test data to test fixtures

* Update path

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments and clean

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
This commit is contained in:
Sylvain Gugger
2021-04-21 11:11:20 -04:00
committed by GitHub
parent ca7ff64f5b
commit dabeb15292
105 changed files with 1062 additions and 560 deletions

View File

@@ -65,10 +65,10 @@ respectively.
.. code-block:: bash
## PYTORCH CODE
python examples/benchmarking/run_benchmark.py --help
python examples/pytorch/benchmarking/run_benchmark.py --help
## TENSORFLOW CODE
python examples/benchmarking/run_benchmark_tf.py --help
python examples/tensorflow/benchmarking/run_benchmark_tf.py --help
An instantiated benchmark object can then simply be run by calling ``benchmark.run()``.

View File

@@ -33,8 +33,8 @@ You can convert any TensorFlow checkpoint for BERT (in particular `the pre-train
This CLI takes as input a TensorFlow checkpoint (three files starting with ``bert_model.ckpt``\ ) and the associated
configuration file (\ ``bert_config.json``\ ), and creates a PyTorch model for this configuration, loads the weights
from the TensorFlow checkpoint in the PyTorch model and saves the resulting model in a standard PyTorch save file that
can be imported using ``from_pretrained()`` (see example in :doc:`quicktour` , `run_glue.py
<https://github.com/huggingface/transformers/blob/master/examples/text-classification/run_glue.py>`_\ ).
can be imported using ``from_pretrained()`` (see example in :doc:`quicktour` , :prefix_link:`run_glue.py
<examples/pytorch/text-classification/run_glue.py>` \ ).
You only need to run this conversion script **once** to get a PyTorch model. You can then disregard the TensorFlow
checkpoint (the three files starting with ``bert_model.ckpt``\ ) but be sure to keep the configuration file (\

View File

@@ -168,13 +168,13 @@ Here is an example of how this can be used on a filesystem that is shared betwee
On the instance with the normal network run your program which will download and cache models (and optionally datasets if you use 🤗 Datasets). For example:
```
python examples/seq2seq/run_translation.py --model_name_or_path t5-small --dataset_name wmt16 --dataset_config ro-en ...
python examples/pytorch/translation/run_translation.py --model_name_or_path t5-small --dataset_name wmt16 --dataset_config ro-en ...
```
and then with the same filesystem you can now run the same program on a firewalled instance:
```
HF_DATASETS_OFFLINE=1 TRANSFORMERS_OFFLINE=1 \
python examples/seq2seq/run_translation.py --model_name_or_path t5-small --dataset_name wmt16 --dataset_config ro-en ...
python examples/pytorch/translation/run_translation.py --model_name_or_path t5-small --dataset_name wmt16 --dataset_config ro-en ...
```
and it should succeed without any hanging waiting to timeout.

View File

@@ -68,8 +68,8 @@ Additionally, the following method can be used to load values from a data file a
Example usage
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
An example using these processors is given in the `run_glue.py
<https://github.com/huggingface/pytorch-transformers/blob/master/examples/text-classification/run_glue.py>`__ script.
An example using these processors is given in the :prefix_link:`run_glue.py
<examples/legacy/text-classification/run_glue.py>` script.
XNLI
@@ -89,8 +89,8 @@ This library hosts the processor to load the XNLI data:
Please note that since the gold labels are available on the test set, evaluation is performed on the test set.
An example using these processors is given in the `run_xnli.py
<https://github.com/huggingface/pytorch-transformers/blob/master/examples/text-classification/run_xnli.py>`__ script.
An example using these processors is given in the :prefix_link:`run_xnli.py
<examples/legacy/text-classification/run_xnli.py>` script.
SQuAD
@@ -169,4 +169,4 @@ Using `tensorflow_datasets` is as easy as using a data file:
Another example using these processors is given in the :prefix_link:`run_squad.py
<examples/question-answering/run_squad.py>` script.
<examples/legacy/question-answering/run_squad.py>` script.

View File

@@ -338,7 +338,7 @@ For example here is how you could use it for ``run_translation.py`` with 2 GPUs:
.. code-block:: bash
python -m torch.distributed.launch --nproc_per_node=2 examples/seq2seq/run_translation.py \
python -m torch.distributed.launch --nproc_per_node=2 examples/pytorch/translation/run_translation.py \
--model_name_or_path t5-small --per_device_train_batch_size 1 \
--output_dir output_dir --overwrite_output_dir \
--do_train --max_train_samples 500 --num_train_epochs 1 \
@@ -363,7 +363,7 @@ For example here is how you could use it for ``run_translation.py`` with 2 GPUs:
.. code-block:: bash
python -m torch.distributed.launch --nproc_per_node=2 examples/seq2seq/run_translation.py \
python -m torch.distributed.launch --nproc_per_node=2 examples/pytorch/translation/run_translation.py \
--model_name_or_path t5-small --per_device_train_batch_size 1 \
--output_dir output_dir --overwrite_output_dir \
--do_train --max_train_samples 500 --num_train_epochs 1 \
@@ -540,7 +540,7 @@ Here is an example of running ``run_translation.py`` under DeepSpeed deploying a
.. code-block:: bash
deepspeed examples/seq2seq/run_translation.py \
deepspeed examples/pytorch/translation/run_translation.py \
--deepspeed tests/deepspeed/ds_config.json \
--model_name_or_path t5-small --per_device_train_batch_size 1 \
--output_dir output_dir --overwrite_output_dir --fp16 \
@@ -565,7 +565,7 @@ To deploy DeepSpeed with one GPU adjust the :class:`~transformers.Trainer` comma
.. code-block:: bash
deepspeed --num_gpus=1 examples/seq2seq/run_translation.py \
deepspeed --num_gpus=1 examples/pytorch/translation/run_translation.py \
--deepspeed tests/deepspeed/ds_config.json \
--model_name_or_path t5-small --per_device_train_batch_size 1 \
--output_dir output_dir --overwrite_output_dir --fp16 \
@@ -617,7 +617,7 @@ Notes:
.. code-block:: bash
deepspeed --include localhost:1 examples/seq2seq/run_translation.py ...
deepspeed --include localhost:1 examples/pytorch/translation/run_translation.py ...
In this example, we tell DeepSpeed to use GPU 1 (second gpu).
@@ -711,7 +711,7 @@ shell from a cell. For example, to use ``run_translation.py`` you would launch i
.. code-block::
!git clone https://github.com/huggingface/transformers
!cd transformers; deepspeed examples/seq2seq/run_translation.py ...
!cd transformers; deepspeed examples/pytorch/translation/run_translation.py ...
or with ``%%bash`` magic, where you can write a multi-line code for the shell program to run:
@@ -721,7 +721,7 @@ or with ``%%bash`` magic, where you can write a multi-line code for the shell pr
git clone https://github.com/huggingface/transformers
cd transformers
deepspeed examples/seq2seq/run_translation.py ...
deepspeed examples/pytorch/translation/run_translation.py ...
In such case you don't need any of the code presented at the beginning of this section.

View File

@@ -43,7 +43,7 @@ Examples
_______________________________________________________________________________________________________________________
- Examples and scripts for fine-tuning BART and other models for sequence to sequence tasks can be found in
:prefix_link:`examples/seq2seq/ <examples/seq2seq/README.md>`.
:prefix_link:`examples/pytorch/summarization/ <examples/pytorch/summarization/README.md>`.
- An example of how to train :class:`~transformers.BartForConditionalGeneration` with a Hugging Face :obj:`datasets`
object can be found in this `forum discussion
<https://discuss.huggingface.co/t/train-bart-for-conditional-generation-e-g-summarization/1904>`__.

View File

@@ -43,7 +43,7 @@ Examples
_______________________________________________________________________________________________________________________
- BARThez can be fine-tuned on sequence-to-sequence tasks in a similar way as BART, check:
:prefix_link:`examples/seq2seq/ <examples/seq2seq/README.md>`.
:prefix_link:`examples/pytorch/summarization/ <examples/pytorch/summarization/README.md>`.
BarthezTokenizer

View File

@@ -44,8 +44,8 @@ Tips:
- DistilBERT doesn't have options to select the input positions (:obj:`position_ids` input). This could be added if
necessary though, just let us know if you need this option.
This model was contributed by `victorsanh <https://huggingface.co/victorsanh>`__. The original code can be found `here
<https://github.com/huggingface/transformers/tree/master/examples/distillation>`__.
This model was contributed by `victorsanh <https://huggingface.co/victorsanh>`__. The original code can be found
:prefix_link:`here <examples/research-projects/distillation>`.
DistilBertConfig

View File

@@ -53,7 +53,8 @@ Examples
_______________________________________________________________________________________________________________________
- :prefix_link:`Script <examples/research_projects/seq2seq-distillation/finetune_pegasus_xsum.sh>` to fine-tune pegasus
on the XSUM dataset. Data download instructions at :prefix_link:`examples/seq2seq/ <examples/seq2seq/README.md>`.
on the XSUM dataset. Data download instructions at :prefix_link:`examples/pytorch/summarization/
<examples/pytorch/summarization/README.md>`.
- FP16 is not supported (help/ideas on this appreciated!).
- The adafactor optimizer is recommended for pegasus fine-tuning.

View File

@@ -21,7 +21,7 @@ Question Answering <https://yjernite.github.io/lfqa.html>`__. RetriBERT is a sma
pair of BERT encoders with lower-dimension projection for dense semantic indexing of text.
This model was contributed by `yjernite <https://huggingface.co/yjernite>`__. Code to train and use the model can be
found `here <https://github.com/huggingface/transformers/tree/master/examples/distillation>`__.
found :prefix_link:`here <examples/research-projects/distillation>`.
RetriBertConfig

View File

@@ -41,7 +41,7 @@ Tips:
using only a sub-set of the output tokens as target which are selected with the :obj:`target_mapping` input.
- To use XLNet for sequential decoding (i.e. not in fully bi-directional setting), use the :obj:`perm_mask` and
:obj:`target_mapping` inputs to control the attention span and outputs (see examples in
`examples/text-generation/run_generation.py`)
`examples/pytorch/text-generation/run_generation.py`)
- XLNet is one of the few models that has no sequence length limit.
This model was contributed by `thomwolf <https://huggingface.co/thomwolf>`__. The original code can be found `here

View File

@@ -682,7 +682,8 @@ The `mbart-large-en-ro checkpoint <https://huggingface.co/facebook/mbart-large-e
romanian translation.
The `mbart-large-cc25 <https://huggingface.co/facebook/mbart-large-cc25>`_ checkpoint can be finetuned for other
translation and summarization tasks, using code in ```examples/seq2seq/``` , but is not very useful without finetuning.
translation and summarization tasks, using code in ```examples/pytorch/translation/``` , but is not very useful without
finetuning.
ProphetNet

View File

@@ -90,8 +90,8 @@ You can then feed it all as input to your model:
>>> outputs = model(input_ids, langs=langs)
The example :prefix_link:`run_generation.py <examples/text-generation/run_generation.py>` can generate text using the
CLM checkpoints from XLM, using the language embeddings.
The example :prefix_link:`run_generation.py <examples/pytorch/text-generation/run_generation.py>` can generate text
using the CLM checkpoints from XLM, using the language embeddings.
XLM without Language Embeddings
-----------------------------------------------------------------------------------------------------------------------

View File

@@ -325,7 +325,7 @@ When you create a `HuggingFace` Estimator, you can specify a [training script th
If you are using `git_config` to run the [🤗 Transformers examples scripts](https://github.com/huggingface/transformers/tree/master/examples) keep in mind that you need to configure the right `'branch'` for you `transformers_version`, e.g. if you use `transformers_version='4.4.2` you have to use `'branch':'v4.4.2'`.
As an example to use `git_config` with an [example script from the transformers repository](https://github.com/huggingface/transformers/tree/master/examples/text-classification).
As an example to use `git_config` with an [example script from the transformers repository](https://github.com/huggingface/transformers/tree/master/examples/pytorch/text-classification).
_Tip: define `output_dir` as `/opt/ml/model` in the hyperparameter for the script to save your model to S3 after training._
@@ -338,7 +338,7 @@ git_config = {'repo': 'https://github.com/huggingface/transformers.git','branch'
# create the Estimator
huggingface_estimator = HuggingFace(
entry_point='run_glue.py',
source_dir='./examples/text-classification',
source_dir='./examples/pytorch/text-classification',
git_config=git_config,
instance_type='ml.p3.2xlarge',
instance_count=1,

View File

@@ -55,10 +55,10 @@ Sequence Classification
Sequence classification is the task of classifying sequences according to a given number of classes. An example of
sequence classification is the GLUE dataset, which is entirely based on that task. If you would like to fine-tune a
model on a GLUE sequence classification task, you may leverage the :prefix_link:`run_glue.py
<examples/text-classification/run_glue.py>`, :prefix_link:`run_tf_glue.py
<examples/text-classification/run_tf_glue.py>`, :prefix_link:`run_tf_text_classification.py
<examples/text-classification/run_tf_text_classification.py>` or :prefix_link:`run_xnli.py
<examples/text-classification/run_xnli.py>` scripts.
<examples/pytorch/text-classification/run_glue.py>`, :prefix_link:`run_tf_glue.py
<examples/tensorflow/text-classification/run_tf_glue.py>`, :prefix_link:`run_tf_text_classification.py
<examples/tensorflow/text-classification/run_tf_text_classification.py>` or :prefix_link:`run_xnli.py
<examples/pytorch/text-classification/run_xnli.py>` scripts.
Here is an example of using pipelines to do sentiment analysis: identifying if a sequence is positive or negative. It
leverages a fine-tuned model on sst2, which is a GLUE task.
@@ -168,8 +168,10 @@ Extractive Question Answering
Extractive Question Answering is the task of extracting an answer from a text given a question. An example of a
question answering dataset is the SQuAD dataset, which is entirely based on that task. If you would like to fine-tune a
model on a SQuAD task, you may leverage the `run_qa.py
<https://github.com/huggingface/transformers/tree/master/examples/question-answering/run_qa.py>`__ and `run_tf_squad.py
<https://github.com/huggingface/transformers/tree/master/examples/question-answering/run_tf_squad.py>`__ scripts.
<https://github.com/huggingface/transformers/tree/master/examples/pytorch/question-answering/run_qa.py>`__ and
`run_tf_squad.py
<https://github.com/huggingface/transformers/tree/master/examples/tensorflow/question-answering/run_tf_squad.py>`__
scripts.
Here is an example of using pipelines to do question answering: extracting an answer from a text given a question. It
@@ -184,7 +186,7 @@ leverages a fine-tuned model on SQuAD.
>>> context = r"""
... Extractive Question Answering is the task of extracting an answer from a text given a question. An example of a
... question answering dataset is the SQuAD dataset, which is entirely based on that task. If you would like to fine-tune
... a model on a SQuAD task, you may leverage the examples/question-answering/run_squad.py script.
... a model on a SQuAD task, you may leverage the examples/pytorch/question-answering/run_squad.py script.
... """
This returns an answer extracted from the text, a confidence score, alongside "start" and "end" values, which are the
@@ -325,8 +327,7 @@ fill that mask with an appropriate token. This allows the model to attend to bot
right of the mask) and the left context (tokens on the left of the mask). Such a training creates a strong basis for
downstream tasks requiring bi-directional context, such as SQuAD (question answering, see `Lewis, Lui, Goyal et al.
<https://arxiv.org/abs/1910.13461>`__, part 4.2). If you would like to fine-tune a model on a masked language modeling
task, you may leverage the `run_mlm.py
<https://github.com/huggingface/transformers/tree/master/examples/language-modeling/run_mlm.py>`__ script.
task, you may leverage the :prefix_link:`run_mlm.py <examples/pytorch/language-modeling/run_mlm.py>` script.
Here is an example of using pipelines to replace a mask from a sequence:
@@ -435,7 +436,7 @@ Causal Language Modeling
Causal language modeling is the task of predicting the token following a sequence of tokens. In this situation, the
model only attends to the left context (tokens on the left of the mask). Such a training is particularly interesting
for generation tasks. If you would like to fine-tune a model on a causal language modeling task, you may leverage the
`run_clm.py <https://github.com/huggingface/transformers/tree/master/examples/language-modeling/run_clm.py>`__ script.
:prefix_link:`run_clm.py <examples/pytorch/language-modeling/run_clm.py>` script.
Usually, the next token is predicted by sampling from the logits of the last hidden state the model produces from the
input sequence.
@@ -602,8 +603,7 @@ Named Entity Recognition
Named Entity Recognition (NER) is the task of classifying tokens according to a class, for example, identifying a token
as a person, an organisation or a location. An example of a named entity recognition dataset is the CoNLL-2003 dataset,
which is entirely based on that task. If you would like to fine-tune a model on an NER task, you may leverage the
`run_ner.py <https://github.com/huggingface/transformers/tree/master/examples/token-classification/run_ner.py>`__
script.
:prefix_link:`run_ner.py <examples/pytorch/token-classification/run_ner.py>` script.
Here is an example of using pipelines to do named entity recognition, specifically, trying to identify tokens as
belonging to one of 9 classes:
@@ -743,11 +743,12 @@ Summarization
Summarization is the task of summarizing a document or an article into a shorter text. If you would like to fine-tune a
model on a summarization task, you may leverage the `run_summarization.py
<https://github.com/huggingface/transformers/tree/master/examples/seq2seq/run_summarization.py>`__ script.
<https://github.com/huggingface/transformers/tree/master/examples/pytorch/summarization/run_summarization.py>`__
script.
An example of a summarization dataset is the CNN / Daily Mail dataset, which consists of long news articles and was
created for the task of summarization. If you would like to fine-tune a model on a summarization task, various
approaches are described in this :prefix_link:`document <examples/seq2seq/README.md>`.
approaches are described in this :prefix_link:`document <examples/pytorch/summarization/README.md>`.
Here is an example of using the pipelines to do summarization. It leverages a Bart model that was fine-tuned on the CNN
/ Daily Mail data set.
@@ -794,7 +795,7 @@ Here is an example of doing summarization using a model and a tokenizer. The pro
3. Add the T5 specific prefix "summarize: ".
4. Use the ``PreTrainedModel.generate()`` method to generate the summary.
In this example we use Google`s T5 model. Even though it was pre-trained only on a multi-task mixed dataset (including
In this example we use Google's T5 model. Even though it was pre-trained only on a multi-task mixed dataset (including
CNN / Daily Mail), it yields very good results.
.. code-block::
@@ -823,11 +824,12 @@ Translation
Translation is the task of translating a text from one language to another. If you would like to fine-tune a model on a
translation task, you may leverage the `run_translation.py
<https://github.com/huggingface/transformers/tree/master/examples/seq2seq/run_translation.py>`__ script.
<https://github.com/huggingface/transformers/tree/master/examples/pytorch/translation/run_translation.py>`__ script.
An example of a translation dataset is the WMT English to German dataset, which has sentences in English as the input
data and the corresponding sentences in German as the target data. If you would like to fine-tune a model on a
translation task, various approaches are described in this :prefix_link:`document <examples/seq2seq/README.md>`.
translation task, various approaches are described in this :prefix_link:`document
<examples/pytorch.translation/README.md>`.
Here is an example of using the pipelines to do translation. It leverages a T5 model that was only pre-trained on a
multi-task mixture dataset (including WMT), yet, yielding impressive translation results.