Examples reorg (#11350)
* Base move
* Examples reorganization
* Update references
* Put back test data
* Move conftest
* More fixes
* Move test data to test fixtures
* Update path
* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments and clean

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
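For anyone updating local commands or links after this move, the renames visible in the hunks of this commit can be collected into a lookup table. The following is a minimal sketch and not part of the commit: `MOVED`, `new_example_path` and `rewrite_command` are hypothetical names, and the table lists only a few scripts whose old and new locations both appear in this diff. Scripts such as `run_glue.py` and `run_squad.py` are left out on purpose, because different hunks point them at different targets (`examples/pytorch/` in some, `examples/legacy/` in others).

```python
# Old -> new script locations, copied from hunks in this diff (not exhaustive).
MOVED = {
    "examples/benchmarking/run_benchmark.py": "examples/pytorch/benchmarking/run_benchmark.py",
    "examples/benchmarking/run_benchmark_tf.py": "examples/tensorflow/benchmarking/run_benchmark_tf.py",
    "examples/seq2seq/run_translation.py": "examples/pytorch/translation/run_translation.py",
    "examples/seq2seq/run_summarization.py": "examples/pytorch/summarization/run_summarization.py",
    "examples/language-modeling/run_mlm.py": "examples/pytorch/language-modeling/run_mlm.py",
    "examples/language-modeling/run_clm.py": "examples/pytorch/language-modeling/run_clm.py",
    "examples/token-classification/run_ner.py": "examples/pytorch/token-classification/run_ner.py",
    "examples/text-generation/run_generation.py": "examples/pytorch/text-generation/run_generation.py",
}


def new_example_path(old_path: str) -> str:
    """Return the post-reorg path, or the input unchanged if it is not in the table."""
    return MOVED.get(old_path, old_path)


def rewrite_command(command: str) -> str:
    """Swap any space-separated token that exactly matches an old path."""
    return " ".join(new_example_path(token) for token in command.split(" "))


if __name__ == "__main__":
    print(rewrite_command("deepspeed --num_gpus=1 examples/seq2seq/run_translation.py ..."))
```

Exact token matching is used rather than substring replacement so that unrelated arguments which merely contain an old directory name are left alone.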
@@ -65,10 +65,10 @@ respectively.
 .. code-block:: bash

 ## PYTORCH CODE
-python examples/benchmarking/run_benchmark.py --help
+python examples/pytorch/benchmarking/run_benchmark.py --help

 ## TENSORFLOW CODE
-python examples/benchmarking/run_benchmark_tf.py --help
+python examples/tensorflow/benchmarking/run_benchmark_tf.py --help


 An instantiated benchmark object can then simply be run by calling ``benchmark.run()``.
@@ -33,8 +33,8 @@ You can convert any TensorFlow checkpoint for BERT (in particular `the pre-train
 This CLI takes as input a TensorFlow checkpoint (three files starting with ``bert_model.ckpt``\ ) and the associated
 configuration file (\ ``bert_config.json``\ ), and creates a PyTorch model for this configuration, loads the weights
 from the TensorFlow checkpoint in the PyTorch model and saves the resulting model in a standard PyTorch save file that
-can be imported using ``from_pretrained()`` (see example in :doc:`quicktour` , `run_glue.py
-<https://github.com/huggingface/transformers/blob/master/examples/text-classification/run_glue.py>`_\ ).
+can be imported using ``from_pretrained()`` (see example in :doc:`quicktour` , :prefix_link:`run_glue.py
+<examples/pytorch/text-classification/run_glue.py>` \ ).

 You only need to run this conversion script **once** to get a PyTorch model. You can then disregard the TensorFlow
 checkpoint (the three files starting with ``bert_model.ckpt``\ ) but be sure to keep the configuration file (\
@@ -168,13 +168,13 @@ Here is an example of how this can be used on a filesystem that is shared betwee
 On the instance with the normal network run your program which will download and cache models (and optionally datasets if you use 🤗 Datasets). For example:

 ```
-python examples/seq2seq/run_translation.py --model_name_or_path t5-small --dataset_name wmt16 --dataset_config ro-en ...
+python examples/pytorch/translation/run_translation.py --model_name_or_path t5-small --dataset_name wmt16 --dataset_config ro-en ...
 ```

 and then with the same filesystem you can now run the same program on a firewalled instance:
 ```
 HF_DATASETS_OFFLINE=1 TRANSFORMERS_OFFLINE=1 \
-python examples/seq2seq/run_translation.py --model_name_or_path t5-small --dataset_name wmt16 --dataset_config ro-en ...
+python examples/pytorch/translation/run_translation.py --model_name_or_path t5-small --dataset_name wmt16 --dataset_config ro-en ...
 ```
 and it should succeed without any hanging waiting to timeout.
@@ -68,8 +68,8 @@ Additionally, the following method can be used to load values from a data file a
 Example usage
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-An example using these processors is given in the `run_glue.py
-<https://github.com/huggingface/pytorch-transformers/blob/master/examples/text-classification/run_glue.py>`__ script.
+An example using these processors is given in the :prefix_link:`run_glue.py
+<examples/legacy/text-classification/run_glue.py>` script.


 XNLI
@@ -89,8 +89,8 @@ This library hosts the processor to load the XNLI data:
 Please note that since the gold labels are available on the test set, evaluation is performed on the test set.

-An example using these processors is given in the `run_xnli.py
-<https://github.com/huggingface/pytorch-transformers/blob/master/examples/text-classification/run_xnli.py>`__ script.
+An example using these processors is given in the :prefix_link:`run_xnli.py
+<examples/legacy/text-classification/run_xnli.py>` script.


 SQuAD
@@ -169,4 +169,4 @@ Using `tensorflow_datasets` is as easy as using a data file:
 Another example using these processors is given in the :prefix_link:`run_squad.py
-<examples/question-answering/run_squad.py>` script.
+<examples/legacy/question-answering/run_squad.py>` script.
@@ -338,7 +338,7 @@ For example here is how you could use it for ``run_translation.py`` with 2 GPUs:
 .. code-block:: bash

-python -m torch.distributed.launch --nproc_per_node=2 examples/seq2seq/run_translation.py \
+python -m torch.distributed.launch --nproc_per_node=2 examples/pytorch/translation/run_translation.py \
 --model_name_or_path t5-small --per_device_train_batch_size 1 \
 --output_dir output_dir --overwrite_output_dir \
 --do_train --max_train_samples 500 --num_train_epochs 1 \
@@ -363,7 +363,7 @@ For example here is how you could use it for ``run_translation.py`` with 2 GPUs:
 .. code-block:: bash

-python -m torch.distributed.launch --nproc_per_node=2 examples/seq2seq/run_translation.py \
+python -m torch.distributed.launch --nproc_per_node=2 examples/pytorch/translation/run_translation.py \
 --model_name_or_path t5-small --per_device_train_batch_size 1 \
 --output_dir output_dir --overwrite_output_dir \
 --do_train --max_train_samples 500 --num_train_epochs 1 \
@@ -540,7 +540,7 @@ Here is an example of running ``run_translation.py`` under DeepSpeed deploying a
 .. code-block:: bash

-deepspeed examples/seq2seq/run_translation.py \
+deepspeed examples/pytorch/translation/run_translation.py \
 --deepspeed tests/deepspeed/ds_config.json \
 --model_name_or_path t5-small --per_device_train_batch_size 1 \
 --output_dir output_dir --overwrite_output_dir --fp16 \
@@ -565,7 +565,7 @@ To deploy DeepSpeed with one GPU adjust the :class:`~transformers.Trainer` comma
 .. code-block:: bash

-deepspeed --num_gpus=1 examples/seq2seq/run_translation.py \
+deepspeed --num_gpus=1 examples/pytorch/translation/run_translation.py \
 --deepspeed tests/deepspeed/ds_config.json \
 --model_name_or_path t5-small --per_device_train_batch_size 1 \
 --output_dir output_dir --overwrite_output_dir --fp16 \
@@ -617,7 +617,7 @@ Notes:
 .. code-block:: bash

-deepspeed --include localhost:1 examples/seq2seq/run_translation.py ...
+deepspeed --include localhost:1 examples/pytorch/translation/run_translation.py ...

 In this example, we tell DeepSpeed to use GPU 1 (second gpu).
@@ -711,7 +711,7 @@ shell from a cell. For example, to use ``run_translation.py`` you would launch i
 .. code-block::

 !git clone https://github.com/huggingface/transformers
-!cd transformers; deepspeed examples/seq2seq/run_translation.py ...
+!cd transformers; deepspeed examples/pytorch/translation/run_translation.py ...

 or with ``%%bash`` magic, where you can write a multi-line code for the shell program to run:
@@ -721,7 +721,7 @@ or with ``%%bash`` magic, where you can write a multi-line code for the shell pr
 git clone https://github.com/huggingface/transformers
 cd transformers
-deepspeed examples/seq2seq/run_translation.py ...
+deepspeed examples/pytorch/translation/run_translation.py ...

 In such case you don't need any of the code presented at the beginning of this section.
@@ -43,7 +43,7 @@ Examples
 _______________________________________________________________________________________________________________________

 - Examples and scripts for fine-tuning BART and other models for sequence to sequence tasks can be found in
-:prefix_link:`examples/seq2seq/ <examples/seq2seq/README.md>`.
+:prefix_link:`examples/pytorch/summarization/ <examples/pytorch/summarization/README.md>`.
 - An example of how to train :class:`~transformers.BartForConditionalGeneration` with a Hugging Face :obj:`datasets`
 object can be found in this `forum discussion
 <https://discuss.huggingface.co/t/train-bart-for-conditional-generation-e-g-summarization/1904>`__.
@@ -43,7 +43,7 @@ Examples
 _______________________________________________________________________________________________________________________

 - BARThez can be fine-tuned on sequence-to-sequence tasks in a similar way as BART, check:
-:prefix_link:`examples/seq2seq/ <examples/seq2seq/README.md>`.
+:prefix_link:`examples/pytorch/summarization/ <examples/pytorch/summarization/README.md>`.


 BarthezTokenizer
@@ -44,8 +44,8 @@ Tips:
 - DistilBERT doesn't have options to select the input positions (:obj:`position_ids` input). This could be added if
 necessary though, just let us know if you need this option.

-This model was contributed by `victorsanh <https://huggingface.co/victorsanh>`__. The original code can be found `here
-<https://github.com/huggingface/transformers/tree/master/examples/distillation>`__.
+This model was contributed by `victorsanh <https://huggingface.co/victorsanh>`__. The original code can be found
+:prefix_link:`here <examples/research-projects/distillation>`.


 DistilBertConfig
@@ -53,7 +53,8 @@ Examples
 _______________________________________________________________________________________________________________________

 - :prefix_link:`Script <examples/research_projects/seq2seq-distillation/finetune_pegasus_xsum.sh>` to fine-tune pegasus
-on the XSUM dataset. Data download instructions at :prefix_link:`examples/seq2seq/ <examples/seq2seq/README.md>`.
+on the XSUM dataset. Data download instructions at :prefix_link:`examples/pytorch/summarization/
+<examples/pytorch/summarization/README.md>`.
 - FP16 is not supported (help/ideas on this appreciated!).
 - The adafactor optimizer is recommended for pegasus fine-tuning.
@@ -21,7 +21,7 @@ Question Answering <https://yjernite.github.io/lfqa.html>`__. RetriBERT is a sma
 pair of BERT encoders with lower-dimension projection for dense semantic indexing of text.

 This model was contributed by `yjernite <https://huggingface.co/yjernite>`__. Code to train and use the model can be
-found `here <https://github.com/huggingface/transformers/tree/master/examples/distillation>`__.
+found :prefix_link:`here <examples/research-projects/distillation>`.


 RetriBertConfig
@@ -41,7 +41,7 @@ Tips:
 using only a sub-set of the output tokens as target which are selected with the :obj:`target_mapping` input.
 - To use XLNet for sequential decoding (i.e. not in fully bi-directional setting), use the :obj:`perm_mask` and
 :obj:`target_mapping` inputs to control the attention span and outputs (see examples in
-`examples/text-generation/run_generation.py`)
+`examples/pytorch/text-generation/run_generation.py`)
 - XLNet is one of the few models that has no sequence length limit.

 This model was contributed by `thomwolf <https://huggingface.co/thomwolf>`__. The original code can be found `here
@@ -682,7 +682,8 @@ The `mbart-large-en-ro checkpoint <https://huggingface.co/facebook/mbart-large-e
 romanian translation.

 The `mbart-large-cc25 <https://huggingface.co/facebook/mbart-large-cc25>`_ checkpoint can be finetuned for other
-translation and summarization tasks, using code in ```examples/seq2seq/``` , but is not very useful without finetuning.
+translation and summarization tasks, using code in ```examples/pytorch/translation/``` , but is not very useful without
+finetuning.


 ProphetNet
@@ -90,8 +90,8 @@ You can then feed it all as input to your model:
 >>> outputs = model(input_ids, langs=langs)


-The example :prefix_link:`run_generation.py <examples/text-generation/run_generation.py>` can generate text using the
-CLM checkpoints from XLM, using the language embeddings.
+The example :prefix_link:`run_generation.py <examples/pytorch/text-generation/run_generation.py>` can generate text
+using the CLM checkpoints from XLM, using the language embeddings.

 XLM without Language Embeddings
 -----------------------------------------------------------------------------------------------------------------------
@@ -325,7 +325,7 @@ When you create a `HuggingFace` Estimator, you can specify a [training script th
 If you are using `git_config` to run the [🤗 Transformers examples scripts](https://github.com/huggingface/transformers/tree/master/examples) keep in mind that you need to configure the right `'branch'` for you `transformers_version`, e.g. if you use `transformers_version='4.4.2` you have to use `'branch':'v4.4.2'`.

-As an example to use `git_config` with an [example script from the transformers repository](https://github.com/huggingface/transformers/tree/master/examples/text-classification).
+As an example to use `git_config` with an [example script from the transformers repository](https://github.com/huggingface/transformers/tree/master/examples/pytorch/text-classification).

 _Tip: define `output_dir` as `/opt/ml/model` in the hyperparameter for the script to save your model to S3 after training._
@@ -338,7 +338,7 @@ git_config = {'repo': 'https://github.com/huggingface/transformers.git','branch'
 # create the Estimator
 huggingface_estimator = HuggingFace(
 entry_point='run_glue.py',
-source_dir='./examples/text-classification',
+source_dir='./examples/pytorch/text-classification',
 git_config=git_config,
 instance_type='ml.p3.2xlarge',
 instance_count=1,
@@ -55,10 +55,10 @@ Sequence Classification
 Sequence classification is the task of classifying sequences according to a given number of classes. An example of
 sequence classification is the GLUE dataset, which is entirely based on that task. If you would like to fine-tune a
 model on a GLUE sequence classification task, you may leverage the :prefix_link:`run_glue.py
-<examples/text-classification/run_glue.py>`, :prefix_link:`run_tf_glue.py
-<examples/text-classification/run_tf_glue.py>`, :prefix_link:`run_tf_text_classification.py
-<examples/text-classification/run_tf_text_classification.py>` or :prefix_link:`run_xnli.py
-<examples/text-classification/run_xnli.py>` scripts.
+<examples/pytorch/text-classification/run_glue.py>`, :prefix_link:`run_tf_glue.py
+<examples/tensorflow/text-classification/run_tf_glue.py>`, :prefix_link:`run_tf_text_classification.py
+<examples/tensorflow/text-classification/run_tf_text_classification.py>` or :prefix_link:`run_xnli.py
+<examples/pytorch/text-classification/run_xnli.py>` scripts.

 Here is an example of using pipelines to do sentiment analysis: identifying if a sequence is positive or negative. It
 leverages a fine-tuned model on sst2, which is a GLUE task.
@@ -168,8 +168,10 @@ Extractive Question Answering
 Extractive Question Answering is the task of extracting an answer from a text given a question. An example of a
 question answering dataset is the SQuAD dataset, which is entirely based on that task. If you would like to fine-tune a
 model on a SQuAD task, you may leverage the `run_qa.py
-<https://github.com/huggingface/transformers/tree/master/examples/question-answering/run_qa.py>`__ and `run_tf_squad.py
-<https://github.com/huggingface/transformers/tree/master/examples/question-answering/run_tf_squad.py>`__ scripts.
+<https://github.com/huggingface/transformers/tree/master/examples/pytorch/question-answering/run_qa.py>`__ and
+`run_tf_squad.py
+<https://github.com/huggingface/transformers/tree/master/examples/tensorflow/question-answering/run_tf_squad.py>`__
+scripts.


 Here is an example of using pipelines to do question answering: extracting an answer from a text given a question. It
@@ -184,7 +186,7 @@ leverages a fine-tuned model on SQuAD.
 >>> context = r"""
 ... Extractive Question Answering is the task of extracting an answer from a text given a question. An example of a
 ... question answering dataset is the SQuAD dataset, which is entirely based on that task. If you would like to fine-tune
-... a model on a SQuAD task, you may leverage the examples/question-answering/run_squad.py script.
+... a model on a SQuAD task, you may leverage the examples/pytorch/question-answering/run_squad.py script.
 ... """

 This returns an answer extracted from the text, a confidence score, alongside "start" and "end" values, which are the
@@ -325,8 +327,7 @@ fill that mask with an appropriate token. This allows the model to attend to bot
 right of the mask) and the left context (tokens on the left of the mask). Such a training creates a strong basis for
 downstream tasks requiring bi-directional context, such as SQuAD (question answering, see `Lewis, Lui, Goyal et al.
 <https://arxiv.org/abs/1910.13461>`__, part 4.2). If you would like to fine-tune a model on a masked language modeling
-task, you may leverage the `run_mlm.py
-<https://github.com/huggingface/transformers/tree/master/examples/language-modeling/run_mlm.py>`__ script.
+task, you may leverage the :prefix_link:`run_mlm.py <examples/pytorch/language-modeling/run_mlm.py>` script.

 Here is an example of using pipelines to replace a mask from a sequence:
@@ -435,7 +436,7 @@ Causal Language Modeling
 Causal language modeling is the task of predicting the token following a sequence of tokens. In this situation, the
 model only attends to the left context (tokens on the left of the mask). Such a training is particularly interesting
 for generation tasks. If you would like to fine-tune a model on a causal language modeling task, you may leverage the
-`run_clm.py <https://github.com/huggingface/transformers/tree/master/examples/language-modeling/run_clm.py>`__ script.
+:prefix_link:`run_clm.py <examples/pytorch/language-modeling/run_clm.py>` script.

 Usually, the next token is predicted by sampling from the logits of the last hidden state the model produces from the
 input sequence.
@@ -602,8 +603,7 @@ Named Entity Recognition
 Named Entity Recognition (NER) is the task of classifying tokens according to a class, for example, identifying a token
 as a person, an organisation or a location. An example of a named entity recognition dataset is the CoNLL-2003 dataset,
 which is entirely based on that task. If you would like to fine-tune a model on an NER task, you may leverage the
-`run_ner.py <https://github.com/huggingface/transformers/tree/master/examples/token-classification/run_ner.py>`__
-script.
+:prefix_link:`run_ner.py <examples/pytorch/token-classification/run_ner.py>` script.

 Here is an example of using pipelines to do named entity recognition, specifically, trying to identify tokens as
 belonging to one of 9 classes:
@@ -743,11 +743,12 @@ Summarization
 Summarization is the task of summarizing a document or an article into a shorter text. If you would like to fine-tune a
 model on a summarization task, you may leverage the `run_summarization.py
-<https://github.com/huggingface/transformers/tree/master/examples/seq2seq/run_summarization.py>`__ script.
+<https://github.com/huggingface/transformers/tree/master/examples/pytorch/summarization/run_summarization.py>`__
+script.

 An example of a summarization dataset is the CNN / Daily Mail dataset, which consists of long news articles and was
 created for the task of summarization. If you would like to fine-tune a model on a summarization task, various
-approaches are described in this :prefix_link:`document <examples/seq2seq/README.md>`.
+approaches are described in this :prefix_link:`document <examples/pytorch/summarization/README.md>`.

 Here is an example of using the pipelines to do summarization. It leverages a Bart model that was fine-tuned on the CNN
 / Daily Mail data set.
@@ -794,7 +795,7 @@ Here is an example of doing summarization using a model and a tokenizer. The pro
 3. Add the T5 specific prefix "summarize: ".
 4. Use the ``PreTrainedModel.generate()`` method to generate the summary.

-In this example we use Google`s T5 model. Even though it was pre-trained only on a multi-task mixed dataset (including
+In this example we use Google's T5 model. Even though it was pre-trained only on a multi-task mixed dataset (including
 CNN / Daily Mail), it yields very good results.

 .. code-block::
@@ -823,11 +824,12 @@ Translation
 Translation is the task of translating a text from one language to another. If you would like to fine-tune a model on a
 translation task, you may leverage the `run_translation.py
-<https://github.com/huggingface/transformers/tree/master/examples/seq2seq/run_translation.py>`__ script.
+<https://github.com/huggingface/transformers/tree/master/examples/pytorch/translation/run_translation.py>`__ script.

 An example of a translation dataset is the WMT English to German dataset, which has sentences in English as the input
 data and the corresponding sentences in German as the target data. If you would like to fine-tune a model on a
-translation task, various approaches are described in this :prefix_link:`document <examples/seq2seq/README.md>`.
+translation task, various approaches are described in this :prefix_link:`document
+<examples/pytorch.translation/README.md>`.

 Here is an example of using the pipelines to do translation. It leverages a T5 model that was only pre-trained on a
 multi-task mixture dataset (including WMT), yet, yielding impressive translation results.