Convert rst files (#14888)
* Convert all tutorials and guides
* Convert all remaining rst to mdx
* Track and fix bad links
106
docs/source/main_classes/callback.mdx
Normal file
@@ -0,0 +1,106 @@
|
||||
<!--Copyright 2020 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
-->
|
||||
|
||||
# Callbacks
|
||||
|
||||
Callbacks are objects that can customize the behavior of the training loop in the PyTorch
|
||||
[`Trainer`] (this feature is not yet implemented in TensorFlow) that can inspect the training loop
|
||||
state (for progress reporting, logging on TensorBoard or other ML platforms...) and make decisions (like early
|
||||
stopping).
|
||||
|
||||
Callbacks are "read only" pieces of code: apart from the [`TrainerControl`] object they return, they
|
||||
cannot change anything in the training loop. For customizations that require changes in the training loop, you should
|
||||
subclass [`Trainer`] and override the methods you need (see [trainer](trainer) for examples).
|
||||
|
||||
By default a [`Trainer`] will use the following callbacks:
|
||||
|
||||
- [`DefaultFlowCallback`] which handles the default behavior for logging, saving and evaluation.
|
||||
- [`PrinterCallback`] or [`ProgressCallback`] to display progress and print the
|
||||
logs (the first one is used if you deactivate tqdm through the [`TrainingArguments`], otherwise
|
||||
it's the second one).
|
||||
- [`~integrations.TensorBoardCallback`] if tensorboard is accessible (either through PyTorch >= 1.4
|
||||
or tensorboardX).
|
||||
- [`~integrations.WandbCallback`] if [wandb](https://www.wandb.com/) is installed.
|
||||
- [`~integrations.CometCallback`] if [comet_ml](https://www.comet.ml/site/) is installed.
|
||||
- [`~integrations.MLflowCallback`] if [mlflow](https://www.mlflow.org/) is installed.
|
||||
- [`~integrations.AzureMLCallback`] if [azureml-sdk](https://pypi.org/project/azureml-sdk/) is
|
||||
installed.
|
||||
|
||||
The main class that implements callbacks is [`TrainerCallback`]. It gets the
|
||||
[`TrainingArguments`] used to instantiate the [`Trainer`], can access that
|
||||
Trainer's internal state via [`TrainerState`], and can take some actions on the training loop via
|
||||
[`TrainerControl`].
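
As an illustration of how these three objects fit together, the sketch below defines a callback that reads `state.global_step` and asks the training loop to stop by setting `control.should_training_stop`. The class name `StopAfterNSteps` and its `max_steps` argument are hypothetical names chosen for this example; only `TrainerCallback`, `TrainerState`, `TrainerControl` and the `on_step_end` event come from the library.

```python
from transformers import TrainerCallback, TrainerControl, TrainerState


class StopAfterNSteps(TrainerCallback):
    "Hypothetical callback: request a stop once `max_steps` optimization steps are done"

    def __init__(self, max_steps):
        self.max_steps = max_steps

    def on_step_end(self, args, state, control, **kwargs):
        # `state` is only read; `control` is the one object a callback may modify.
        if state.global_step >= self.max_steps:
            control.should_training_stop = True
        return control


# The event can be exercised by hand, without a Trainer, to see the effect on `control`.
callback = StopAfterNSteps(max_steps=3)
early = callback.on_step_end(None, TrainerState(global_step=1), TrainerControl())
late = callback.on_step_end(None, TrainerState(global_step=3), TrainerControl())
print(early.should_training_stop, late.should_training_stop)
```

In a real run, such a callback would be handed to the [`Trainer`] through its `callbacks` argument, for instance `callbacks=[StopAfterNSteps(max_steps=3)]`.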
|
||||
|
||||
|
||||
## Available Callbacks
|
||||
|
||||
Here is the list of the [`TrainerCallback`] classes available in the library:
|
||||
|
||||
[[autodoc]] integrations.CometCallback
|
||||
- setup
|
||||
|
||||
[[autodoc]] DefaultFlowCallback
|
||||
|
||||
[[autodoc]] PrinterCallback
|
||||
|
||||
[[autodoc]] ProgressCallback
|
||||
|
||||
[[autodoc]] EarlyStoppingCallback
|
||||
|
||||
[[autodoc]] integrations.TensorBoardCallback
|
||||
|
||||
[[autodoc]] integrations.WandbCallback
|
||||
- setup
|
||||
|
||||
[[autodoc]] integrations.MLflowCallback
|
||||
- setup
|
||||
|
||||
[[autodoc]] integrations.AzureMLCallback
|
||||
|
||||
## TrainerCallback
|
||||
|
||||
[[autodoc]] TrainerCallback
|
||||
|
||||
Here is an example of how to register a custom callback with the PyTorch [`Trainer`]:
|
||||
|
||||
```python
from transformers import Trainer, TrainerCallback


class MyCallback(TrainerCallback):
    "A callback that prints a message at the beginning of training"

    def on_train_begin(self, args, state, control, **kwargs):
        print("Starting training")


# `model`, `args`, `train_dataset` and `eval_dataset` are assumed to be defined beforehand.
trainer = Trainer(
    model,
    args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    # We can either pass the callback class this way or an instance of it (MyCallback())
    callbacks=[MyCallback],
)
```
|
||||
|
||||
Another way to register a callback is to call `trainer.add_callback()` as follows:
|
||||
|
||||
```python
trainer = Trainer(...)
trainer.add_callback(MyCallback)
# Alternatively, we can pass an instance of the callback class
trainer.add_callback(MyCallback())
```
|
||||
|
||||
## TrainerState
|
||||
|
||||
[[autodoc]] TrainerState
|
||||
|
||||
## TrainerControl
|
||||
|
||||
[[autodoc]] TrainerControl
|
||||
@@ -1,115 +0,0 @@
|
||||
..
|
||||
Copyright 2020 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
|
||||
Callbacks
|
||||
-----------------------------------------------------------------------------------------------------------------------
|
||||
|
||||
Callbacks are objects that can customize the behavior of the training loop in the PyTorch
|
||||
:class:`~transformers.Trainer` (this feature is not yet implemented in TensorFlow) that can inspect the training loop
|
||||
state (for progress reporting, logging on TensorBoard or other ML platforms...) and take decisions (like early
|
||||
stopping).
|
||||
|
||||
Callbacks are "read only" pieces of code, apart from the :class:`~transformers.TrainerControl` object they return, they
|
||||
cannot change anything in the training loop. For customizations that require changes in the training loop, you should
|
||||
subclass :class:`~transformers.Trainer` and override the methods you need (see :doc:`trainer` for examples).
|
||||
|
||||
By default a :class:`~transformers.Trainer` will use the following callbacks:
|
||||
|
||||
- :class:`~transformers.DefaultFlowCallback` which handles the default behavior for logging, saving and evaluation.
|
||||
- :class:`~transformers.PrinterCallback` or :class:`~transformers.ProgressCallback` to display progress and print the
|
||||
logs (the first one is used if you deactivate tqdm through the :class:`~transformers.TrainingArguments`, otherwise
|
||||
it's the second one).
|
||||
- :class:`~transformers.integrations.TensorBoardCallback` if tensorboard is accessible (either through PyTorch >= 1.4
|
||||
or tensorboardX).
|
||||
- :class:`~transformers.integrations.WandbCallback` if `wandb <https://www.wandb.com/>`__ is installed.
|
||||
- :class:`~transformers.integrations.CometCallback` if `comet_ml <https://www.comet.ml/site/>`__ is installed.
|
||||
- :class:`~transformers.integrations.MLflowCallback` if `mlflow <https://www.mlflow.org/>`__ is installed.
|
||||
- :class:`~transformers.integrations.AzureMLCallback` if `azureml-sdk <https://pypi.org/project/azureml-sdk/>`__ is
|
||||
installed.
|
||||
|
||||
The main class that implements callbacks is :class:`~transformers.TrainerCallback`. It gets the
|
||||
:class:`~transformers.TrainingArguments` used to instantiate the :class:`~transformers.Trainer`, can access that
|
||||
Trainer's internal state via :class:`~transformers.TrainerState`, and can take some actions on the training loop via
|
||||
:class:`~transformers.TrainerControl`.
|
||||
|
||||
|
||||
Available Callbacks
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Here is the list of the available :class:`~transformers.TrainerCallback` in the library:
|
||||
|
||||
.. autoclass:: transformers.integrations.CometCallback
|
||||
:members: setup
|
||||
|
||||
.. autoclass:: transformers.DefaultFlowCallback
|
||||
|
||||
.. autoclass:: transformers.PrinterCallback
|
||||
|
||||
.. autoclass:: transformers.ProgressCallback
|
||||
|
||||
.. autoclass:: transformers.EarlyStoppingCallback
|
||||
|
||||
.. autoclass:: transformers.integrations.TensorBoardCallback
|
||||
|
||||
.. autoclass:: transformers.integrations.WandbCallback
|
||||
:members: setup
|
||||
|
||||
.. autoclass:: transformers.integrations.MLflowCallback
|
||||
:members: setup
|
||||
|
||||
.. autoclass:: transformers.integrations.AzureMLCallback
|
||||
|
||||
TrainerCallback
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.TrainerCallback
|
||||
:members:
|
||||
|
||||
Here is an example of how to register a custom callback with the PyTorch :class:`~transformers.Trainer`:
|
||||
|
||||
.. code-block:: python

    class MyCallback(TrainerCallback):
        "A callback that prints a message at the beginning of training"

        def on_train_begin(self, args, state, control, **kwargs):
            print("Starting training")

    trainer = Trainer(
        model,
        args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        callbacks=[MyCallback]  # We can either pass the callback class this way or an instance of it (MyCallback())
    )
|
||||
|
||||
Another way to register a callback is to call ``trainer.add_callback()`` as follows:
|
||||
|
||||
.. code-block:: python

    trainer = Trainer(...)
    trainer.add_callback(MyCallback)
    # Alternatively, we can pass an instance of the callback class
    trainer.add_callback(MyCallback())
|
||||
|
||||
TrainerState
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.TrainerState
|
||||
:members:
|
||||
|
||||
|
||||
TrainerControl
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.TrainerControl
|
||||
:members:
|
||||
28
docs/source/main_classes/configuration.mdx
Normal file
@@ -0,0 +1,28 @@
|
||||
<!--Copyright 2020 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
-->
|
||||
|
||||
# Configuration
|
||||
|
||||
The base class [`PretrainedConfig`] implements the common methods for loading/saving a configuration
|
||||
either from a local file or directory, or from a pretrained model configuration provided by the library (downloaded
|
||||
from HuggingFace's AWS S3 repository).
|
||||
|
||||
Each derived config class implements model-specific attributes. Common attributes present in all config classes are:
|
||||
`hidden_size`, `num_attention_heads`, and `num_hidden_layers`. Text models further implement:
|
||||
`vocab_size`.
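
As a sketch of the loading/saving round trip described above, the example below uses `BertConfig`, one of the derived config classes, and a temporary directory as the local destination. The choice of `BertConfig` and of the value 6 for `num_hidden_layers` is only for illustration.

```python
import tempfile

from transformers import BertConfig

# Build a configuration, overriding one of the common attributes.
config = BertConfig(num_hidden_layers=6)

with tempfile.TemporaryDirectory() as directory:
    # Writes a `config.json` file into the directory.
    config.save_pretrained(directory)
    # Reloads the configuration from that local directory.
    reloaded = BertConfig.from_pretrained(directory)

print(reloaded.num_hidden_layers, reloaded.hidden_size, reloaded.vocab_size)
```

Attributes that were not overridden, such as `hidden_size`, keep the defaults of the derived class after the round trip.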
|
||||
|
||||
|
||||
## PretrainedConfig
|
||||
|
||||
[[autodoc]] PretrainedConfig
|
||||
- push_to_hub
|
||||
- all
|
||||
@@ -1,31 +0,0 @@
|
||||
..
|
||||
Copyright 2020 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
|
||||
Configuration
|
||||
-----------------------------------------------------------------------------------------------------------------------
|
||||
|
||||
The base class :class:`~transformers.PretrainedConfig` implements the common methods for loading/saving a configuration
|
||||
either from a local file or directory, or from a pretrained model configuration provided by the library (downloaded
|
||||
from HuggingFace's AWS S3 repository).
|
||||
|
||||
Each derived config class implements model specific attributes. Common attributes present in all config classes are:
|
||||
:obj:`hidden_size`, :obj:`num_attention_heads`, and :obj:`num_hidden_layers`. Text models further implement:
|
||||
:obj:`vocab_size`.
|
||||
|
||||
|
||||
|
||||
PretrainedConfig
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.PretrainedConfig
|
||||
:special-members: push_to_hub
|
||||
:members:
|
||||
64
docs/source/main_classes/data_collator.mdx
Normal file
@@ -0,0 +1,64 @@
|
||||
<!--Copyright 2020 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
-->
|
||||
|
||||
# Data Collator
|
||||
|
||||
Data collators are objects that will form a batch by using a list of dataset elements as input. These elements are of
|
||||
the same type as the elements of `train_dataset` or `eval_dataset`.
|
||||
|
||||
To be able to build batches, data collators may apply some processing (like padding). Some of them (like
|
||||
[`DataCollatorForLanguageModeling`]) also apply some random data augmentation (like random masking)
|
||||
on the formed batch.
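
To make the idea of collating concrete, here is a minimal, framework-free sketch of a padding collator. It is not the implementation of any collator in the library: the function name `pad_collate`, the `pad_token_id` default of 0 and the `input_ids`/`attention_mask` keys are assumptions chosen for this example.

```python
def pad_collate(features, pad_token_id=0):
    "Hypothetical collator: pad every `input_ids` list to the longest one in the batch"
    max_length = max(len(feature["input_ids"]) for feature in features)
    batch = {"input_ids": [], "attention_mask": []}
    for feature in features:
        ids = list(feature["input_ids"])
        padding = max_length - len(ids)
        batch["input_ids"].append(ids + [pad_token_id] * padding)
        # 1 marks a real token, 0 marks padding.
        batch["attention_mask"].append([1] * len(ids) + [0] * padding)
    return batch


# A list of dataset elements goes in, a single batch comes out.
batch = pad_collate([{"input_ids": [5, 6, 7]}, {"input_ids": [8]}])
print(batch)
```

The collators documented below follow the same input/output contract but additionally handle tensor conversion and, for some of them, label padding or random masking.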
|
||||
|
||||
Examples of use can be found in the [example scripts](../examples) or [example notebooks](../notebooks).
|
||||
|
||||
|
||||
## Default data collator
|
||||
|
||||
[[autodoc]] data.data_collator.default_data_collator
|
||||
|
||||
## DefaultDataCollator
|
||||
|
||||
[[autodoc]] data.data_collator.DefaultDataCollator
|
||||
|
||||
## DataCollatorWithPadding
|
||||
|
||||
[[autodoc]] data.data_collator.DataCollatorWithPadding
|
||||
|
||||
## DataCollatorForTokenClassification
|
||||
|
||||
[[autodoc]] data.data_collator.DataCollatorForTokenClassification
|
||||
|
||||
## DataCollatorForSeq2Seq
|
||||
|
||||
[[autodoc]] data.data_collator.DataCollatorForSeq2Seq
|
||||
|
||||
## DataCollatorForLanguageModeling
|
||||
|
||||
[[autodoc]] data.data_collator.DataCollatorForLanguageModeling
|
||||
- numpy_mask_tokens
|
||||
- tf_mask_tokens
|
||||
- torch_mask_tokens
|
||||
|
||||
## DataCollatorForWholeWordMask
|
||||
|
||||
[[autodoc]] data.data_collator.DataCollatorForWholeWordMask
|
||||
- numpy_mask_tokens
|
||||
- tf_mask_tokens
|
||||
- torch_mask_tokens
|
||||
|
||||
## DataCollatorForPermutationLanguageModeling
|
||||
|
||||
[[autodoc]] data.data_collator.DataCollatorForPermutationLanguageModeling
|
||||
- numpy_mask_tokens
|
||||
- tf_mask_tokens
|
||||
- torch_mask_tokens
|
||||
@@ -1,78 +0,0 @@
|
||||
..
|
||||
Copyright 2020 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
|
||||
Data Collator
|
||||
-----------------------------------------------------------------------------------------------------------------------
|
||||
|
||||
Data collators are objects that will form a batch by using a list of dataset elements as input. These elements are of
|
||||
the same type as the elements of :obj:`train_dataset` or :obj:`eval_dataset`.
|
||||
|
||||
To be able to build batches, data collators may apply some processing (like padding). Some of them (like
|
||||
:class:`~transformers.DataCollatorForLanguageModeling`) also apply some random data augmentation (like random masking)
|
||||
on the formed batch.
|
||||
|
||||
Examples of use can be found in the :doc:`example scripts <../examples>` or :doc:`example notebooks <../notebooks>`.
|
||||
|
||||
|
||||
Default data collator
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autofunction:: transformers.data.data_collator.default_data_collator
|
||||
|
||||
|
||||
DefaultDataCollator
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.data.data_collator.DefaultDataCollator
|
||||
:members:
|
||||
|
||||
|
||||
DataCollatorWithPadding
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.data.data_collator.DataCollatorWithPadding
|
||||
:members:
|
||||
|
||||
|
||||
DataCollatorForTokenClassification
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.data.data_collator.DataCollatorForTokenClassification
|
||||
:members:
|
||||
|
||||
|
||||
DataCollatorForSeq2Seq
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.data.data_collator.DataCollatorForSeq2Seq
|
||||
:members:
|
||||
|
||||
|
||||
DataCollatorForLanguageModeling
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.data.data_collator.DataCollatorForLanguageModeling
|
||||
:members: numpy_mask_tokens, tf_mask_tokens, torch_mask_tokens
|
||||
|
||||
|
||||
DataCollatorForWholeWordMask
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.data.data_collator.DataCollatorForWholeWordMask
|
||||
:members: numpy_mask_tokens, tf_mask_tokens, torch_mask_tokens
|
||||
|
||||
|
||||
DataCollatorForPermutationLanguageModeling
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.data.data_collator.DataCollatorForPermutationLanguageModeling
|
||||
:members: numpy_mask_tokens, tf_mask_tokens, torch_mask_tokens
|
||||
38
docs/source/main_classes/feature_extractor.mdx
Normal file
@@ -0,0 +1,38 @@
|
||||
<!--Copyright 2021 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
-->
|
||||
|
||||
# Feature Extractor
|
||||
|
||||
A feature extractor is in charge of preparing input features for a multi-modal model. This includes feature extraction
|
||||
from sequences, *e.g.*, pre-processing audio files to Log-Mel Spectrogram features, feature extraction from images
|
||||
*e.g.*, cropping image files, but also padding, normalization, and conversion to NumPy, PyTorch, and TensorFlow
|
||||
tensors.
|
||||
|
||||
|
||||
## FeatureExtractionMixin
|
||||
|
||||
[[autodoc]] feature_extraction_utils.FeatureExtractionMixin
|
||||
- from_pretrained
|
||||
- save_pretrained
|
||||
|
||||
## SequenceFeatureExtractor
|
||||
|
||||
[[autodoc]] SequenceFeatureExtractor
|
||||
- pad
|
||||
|
||||
## BatchFeature
|
||||
|
||||
[[autodoc]] BatchFeature
|
||||
|
||||
## ImageFeatureExtractionMixin
|
||||
|
||||
[[autodoc]] image_utils.ImageFeatureExtractionMixin
|
||||
@@ -1,48 +0,0 @@
|
||||
..
|
||||
Copyright 2021 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
|
||||
|
||||
Feature Extractor
|
||||
-----------------------------------------------------------------------------------------------------------------------
|
||||
|
||||
A feature extractor is in charge of preparing input features for a multi-modal model. This includes feature extraction
|
||||
from sequences, *e.g.*, pre-processing audio files to Log-Mel Spectrogram features, feature extraction from images
|
||||
*e.g.*, cropping image files, but also padding, normalization, and conversion to NumPy, PyTorch, and TensorFlow
|
||||
tensors.
|
||||
|
||||
|
||||
FeatureExtractionMixin
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.feature_extraction_utils.FeatureExtractionMixin
|
||||
:members: from_pretrained, save_pretrained
|
||||
|
||||
|
||||
SequenceFeatureExtractor
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.SequenceFeatureExtractor
|
||||
:members: pad
|
||||
|
||||
|
||||
BatchFeature
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.BatchFeature
|
||||
:members:
|
||||
|
||||
|
||||
ImageFeatureExtractionMixin
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.image_utils.ImageFeatureExtractionMixin
|
||||
:members:
|
||||
20
docs/source/main_classes/keras_callbacks.mdx
Normal file
@@ -0,0 +1,20 @@
|
||||
<!--Copyright 2021 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
-->
|
||||
|
||||
# Keras callbacks
|
||||
|
||||
When training a Transformers model with Keras, there are some library-specific callbacks available to automate common
|
||||
tasks:
|
||||
|
||||
## PushToHubCallback
|
||||
|
||||
[[autodoc]] keras_callbacks.PushToHubCallback
|
||||
@@ -1,22 +0,0 @@
|
||||
..
|
||||
Copyright 2021 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
|
||||
Keras callbacks
|
||||
=======================================================================================================================
|
||||
|
||||
When training a Transformers model with Keras, there are some library-specific callbacks available to automate common
|
||||
tasks:
|
||||
|
||||
PushToHubCallback
|
||||
-----------------------------------------------------------------------------------------------------------------------
|
||||
|
||||
.. autoclass:: transformers.keras_callbacks.PushToHubCallback
|
||||
80
docs/source/main_classes/logging.mdx
Normal file
@@ -0,0 +1,80 @@
|
||||
<!--Copyright 2020 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
-->
|
||||
|
||||
# Logging
|
||||
|
||||
🤗 Transformers has a centralized logging system, so that you can set up the verbosity of the library easily.
|
||||
|
||||
Currently the default verbosity of the library is `WARNING`.
|
||||
|
||||
To change the level of verbosity, just use one of the direct setters. For instance, here is how to change the verbosity
|
||||
to the INFO level.
|
||||
|
||||
```python
import transformers

transformers.logging.set_verbosity_info()
```
|
||||
|
||||
You can also use the environment variable `TRANSFORMERS_VERBOSITY` to override the default verbosity. You can set it
|
||||
to one of the following: `debug`, `info`, `warning`, `error`, `critical`. For example:
|
||||
|
||||
```bash
TRANSFORMERS_VERBOSITY=error ./myprogram.py
```
|
||||
|
||||
Additionally, some `warnings` can be disabled by setting the environment variable
|
||||
`TRANSFORMERS_NO_ADVISORY_WARNINGS` to a true value, like *1*. This will disable any warning that is logged using
|
||||
[`logger.warning_advice`]. For example:
|
||||
|
||||
|
||||
```bash
TRANSFORMERS_NO_ADVISORY_WARNINGS=1 ./myprogram.py
```
|
||||
|
||||
All the methods of this logging module are documented below. The main ones are
|
||||
[`logging.get_verbosity`] to get the current level of verbosity in the logger and
|
||||
[`logging.set_verbosity`] to set the verbosity to the level of your choice. In order (from the least
|
||||
verbose to the most verbose), those levels (with their corresponding int values in parentheses) are:
|
||||
|
||||
- `transformers.logging.CRITICAL` or `transformers.logging.FATAL` (int value, 50): only report the most
|
||||
critical errors.
|
||||
- `transformers.logging.ERROR` (int value, 40): only report errors.
|
||||
- `transformers.logging.WARNING` or `transformers.logging.WARN` (int value, 30): only report errors and
|
||||
warnings. This is the default level used by the library.
|
||||
- `transformers.logging.INFO` (int value, 20): report errors, warnings and basic information.
|
||||
- `transformers.logging.DEBUG` (int value, 10): report all information.
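
The integer values listed above are the same as those of Python's standard `logging` module, so they can be compared and sorted like ordinary integers. The sketch below uses only the standard library to show that ordering and one possible way to map the strings accepted by `TRANSFORMERS_VERBOSITY` to a level; the helper `level_from_env` is a hypothetical name for this example and is not the library's own parsing code.

```python
import logging
import os

# String names accepted by TRANSFORMERS_VERBOSITY, mapped to standard logging levels.
LEVELS = {
    "debug": logging.DEBUG,
    "info": logging.INFO,
    "warning": logging.WARNING,
    "error": logging.ERROR,
    "critical": logging.CRITICAL,
}


def level_from_env(default=logging.WARNING):
    "Hypothetical helper: read TRANSFORMERS_VERBOSITY and fall back to `default`"
    name = os.environ.get("TRANSFORMERS_VERBOSITY", "").lower()
    return LEVELS.get(name, default)


# From the least verbose to the most verbose, as in the list above.
ordered = sorted(LEVELS, key=LEVELS.get, reverse=True)
print(ordered)

os.environ["TRANSFORMERS_VERBOSITY"] = "error"
print(level_from_env())
```

A lower integer means more output: a logger set to a given level reports messages at that level and at every level above it.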
|
||||
|
||||
## Base setters
|
||||
|
||||
[[autodoc]] logging.set_verbosity_error
|
||||
|
||||
[[autodoc]] logging.set_verbosity_warning
|
||||
|
||||
[[autodoc]] logging.set_verbosity_info
|
||||
|
||||
[[autodoc]] logging.set_verbosity_debug
|
||||
|
||||
## Other functions
|
||||
|
||||
[[autodoc]] logging.get_verbosity
|
||||
|
||||
[[autodoc]] logging.set_verbosity
|
||||
|
||||
[[autodoc]] logging.get_logger
|
||||
|
||||
[[autodoc]] logging.enable_default_handler
|
||||
|
||||
[[autodoc]] logging.disable_default_handler
|
||||
|
||||
[[autodoc]] logging.enable_explicit_format
|
||||
|
||||
[[autodoc]] logging.reset_format
|
||||
@@ -1,83 +0,0 @@
|
||||
..
|
||||
Copyright 2020 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
|
||||
Logging
|
||||
-----------------------------------------------------------------------------------------------------------------------
|
||||
|
||||
🤗 Transformers has a centralized logging system, so that you can set up the verbosity of the library easily.
|
||||
|
||||
Currently the default verbosity of the library is ``WARNING``.
|
||||
|
||||
To change the level of verbosity, just use one of the direct setters. For instance, here is how to change the verbosity
|
||||
to the INFO level.
|
||||
|
||||
.. code-block:: python

    import transformers
    transformers.logging.set_verbosity_info()
|
||||
|
||||
You can also use the environment variable ``TRANSFORMERS_VERBOSITY`` to override the default verbosity. You can set it
|
||||
to one of the following: ``debug``, ``info``, ``warning``, ``error``, ``critical``. For example:
|
||||
|
||||
.. code-block:: bash

    TRANSFORMERS_VERBOSITY=error ./myprogram.py
|
||||
|
||||
Additionally, some ``warnings`` can be disabled by setting the environment variable
|
||||
``TRANSFORMERS_NO_ADVISORY_WARNINGS`` to a true value, like `1`. This will disable any warning that is logged using
|
||||
:meth:`logger.warning_advice`. For example:
|
||||
|
||||
|
||||
.. code-block:: bash

    TRANSFORMERS_NO_ADVISORY_WARNINGS=1 ./myprogram.py
|
||||
|
||||
All the methods of this logging module are documented below. The main ones are
|
||||
:func:`transformers.logging.get_verbosity` to get the current level of verbosity in the logger and
|
||||
:func:`transformers.logging.set_verbosity` to set the verbosity to the level of your choice. In order (from the least
|
||||
verbose to the most verbose), those levels (with their corresponding int values in parentheses) are:
|
||||
|
||||
- :obj:`transformers.logging.CRITICAL` or :obj:`transformers.logging.FATAL` (int value, 50): only report the most
|
||||
critical errors.
|
||||
- :obj:`transformers.logging.ERROR` (int value, 40): only report errors.
|
||||
- :obj:`transformers.logging.WARNING` or :obj:`transformers.logging.WARN` (int value, 30): only report errors and
|
||||
warnings. This is the default level used by the library.
|
||||
- :obj:`transformers.logging.INFO` (int value, 20): report errors, warnings and basic information.
|
||||
- :obj:`transformers.logging.DEBUG` (int value, 10): report all information.
|
||||
|
||||
Base setters
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autofunction:: transformers.logging.set_verbosity_error
|
||||
|
||||
.. autofunction:: transformers.logging.set_verbosity_warning
|
||||
|
||||
.. autofunction:: transformers.logging.set_verbosity_info
|
||||
|
||||
.. autofunction:: transformers.logging.set_verbosity_debug
|
||||
|
||||
Other functions
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autofunction:: transformers.logging.get_verbosity
|
||||
|
||||
.. autofunction:: transformers.logging.set_verbosity
|
||||
|
||||
.. autofunction:: transformers.logging.get_logger
|
||||
|
||||
.. autofunction:: transformers.logging.enable_default_handler
|
||||
|
||||
.. autofunction:: transformers.logging.disable_default_handler
|
||||
|
||||
.. autofunction:: transformers.logging.enable_explicit_format
|
||||
|
||||
.. autofunction:: transformers.logging.reset_format
|
||||
99
docs/source/main_classes/model.mdx
Normal file
@@ -0,0 +1,99 @@
|
||||
<!--Copyright 2020 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
-->
|
||||
|
||||
# Models
|
||||
|
||||
The base classes [`PreTrainedModel`], [`TFPreTrainedModel`], and
|
||||
[`FlaxPreTrainedModel`] implement the common methods for loading/saving a model either from a local
|
||||
file or directory, or from a pretrained model configuration provided by the library (downloaded from HuggingFace's AWS
|
||||
S3 repository).
|
||||
|
||||
[`PreTrainedModel`] and [`TFPreTrainedModel`] also implement a few methods which
|
||||
are common among all the models to:
|
||||
|
||||
- resize the input token embeddings when new tokens are added to the vocabulary
|
||||
- prune the attention heads of the model.
|
||||
|
||||
The other methods that are common to each model are defined in [`~modeling_utils.ModuleUtilsMixin`]
|
||||
(for the PyTorch models) and [`~modeling_tf_utils.TFModelUtilsMixin`] (for the TensorFlow models) or
|
||||
for text generation, [`~generation_utils.GenerationMixin`] (for the PyTorch models),
|
||||
[`~generation_tf_utils.TFGenerationMixin`] (for the TensorFlow models) and
|
||||
[`~generation_flax_utils.FlaxGenerationMixin`] (for the Flax/JAX models).
|
||||
|
||||
|
||||
## PreTrainedModel
|
||||
|
||||
[[autodoc]] PreTrainedModel
|
||||
- push_to_hub
|
||||
- all
|
||||
|
||||
<a id='from_pretrained-torch-dtype'></a>
|
||||
|
||||
### Model Instantiation dtype
|
||||
|
||||
Under PyTorch a model normally gets instantiated with `torch.float32` format. This can be an issue if one tries to
|
||||
load a model whose weights are in fp16, since it'd require twice as much memory. To overcome this limitation, you can
|
||||
either explicitly pass the desired `dtype` using the `torch_dtype` argument:
|
||||
|
||||
```python
|
||||
model = T5ForConditionalGeneration.from_pretrained("t5", torch_dtype=torch.float16)
|
||||
```
|
||||
|
||||
or, if you want the model to always load in the optimal memory pattern, you can use the special value `"auto"`,
|
||||
and then `dtype` will be automatically derived from the model's weights:
|
||||
|
||||
```python
|
||||
model = T5ForConditionalGeneration.from_pretrained("t5", torch_dtype="auto")
|
||||
```
|
||||
|
||||
Models instantiated from scratch can also be told which `dtype` to use with:
|
||||
|
||||
```python
|
||||
config = T5Config.from_pretrained("t5")
|
||||
model = AutoModel.from_config(config)
|
||||
```
|
||||
|
||||
Due to PyTorch's design, this functionality is only available for floating-point dtypes.
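To make the memory argument in this section concrete, here is a minimal back-of-the-envelope sketch in plain Python. The parameter count is a hypothetical figure chosen only for illustration, and the helper `weights_size_mb` is not part of the library; the only facts relied on are that a `float32` value takes 4 bytes and a `float16` value takes 2.

```python
# Bytes needed to store one value of each floating-point dtype.
BYTES_PER_VALUE = {"float32": 4, "float16": 2}


def weights_size_mb(num_parameters: int, dtype: str) -> float:
    """Approximate size of the weights alone, in megabytes (10**6 bytes)."""
    return num_parameters * BYTES_PER_VALUE[dtype] / 1e6


# Hypothetical model size, used only for illustration.
num_parameters = 220_000_000

fp32_mb = weights_size_mb(num_parameters, "float32")
fp16_mb = weights_size_mb(num_parameters, "float16")

print(f"float32: {fp32_mb:.0f} MB, float16: {fp16_mb:.0f} MB")
```

This counts the weights only; activations and any optimizer state come on top of it.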
|
||||
|
||||
|
||||
|
||||
## ModuleUtilsMixin
|
||||
|
||||
[[autodoc]] modeling_utils.ModuleUtilsMixin
|
||||
|
||||
## TFPreTrainedModel
|
||||
|
||||
[[autodoc]] TFPreTrainedModel
|
||||
- push_to_hub
|
||||
- all
|
||||
|
||||
## TFModelUtilsMixin
|
||||
|
||||
[[autodoc]] modeling_tf_utils.TFModelUtilsMixin
|
||||
|
||||
## FlaxPreTrainedModel
|
||||
|
||||
[[autodoc]] FlaxPreTrainedModel
|
||||
- push_to_hub
|
||||
- all
|
||||
|
||||
## Generation
|
||||
|
||||
[[autodoc]] generation_utils.GenerationMixin
|
||||
|
||||
[[autodoc]] generation_tf_utils.TFGenerationMixin
|
||||
|
||||
[[autodoc]] generation_flax_utils.FlaxGenerationMixin
|
||||
|
||||
## Pushing to the Hub
|
||||
|
||||
[[autodoc]] file_utils.PushToHubMixin
|
||||
@@ -1,120 +0,0 @@
|
||||
..
|
||||
Copyright 2020 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
|
||||
Models
|
||||
-----------------------------------------------------------------------------------------------------------------------
|
||||
|
||||
The base classes :class:`~transformers.PreTrainedModel`, :class:`~transformers.TFPreTrainedModel`, and
|
||||
:class:`~transformers.FlaxPreTrainedModel` implement the common methods for loading/saving a model either from a local
|
||||
file or directory, or from a pretrained model configuration provided by the library (downloaded from HuggingFace's AWS
|
||||
S3 repository).
|
||||
|
||||
:class:`~transformers.PreTrainedModel` and :class:`~transformers.TFPreTrainedModel` also implement a few methods which
|
||||
are common among all the models to:
|
||||
|
||||
- resize the input token embeddings when new tokens are added to the vocabulary
|
||||
- prune the attention heads of the model.
|
||||
|
||||
The other methods that are common to each model are defined in :class:`~transformers.modeling_utils.ModuleUtilsMixin`
|
||||
(for the PyTorch models) and :class:`~transformers.modeling_tf_utils.TFModelUtilsMixin` (for the TensorFlow models) or
|
||||
for text generation, :class:`~transformers.generation_utils.GenerationMixin` (for the PyTorch models),
|
||||
:class:`~transformers.generation_tf_utils.TFGenerationMixin` (for the TensorFlow models) and
|
||||
:class:`~transformers.generation_flax_utils.FlaxGenerationMixin` (for the Flax/JAX models).
|
||||
|
||||
|
||||
PreTrainedModel
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.PreTrainedModel
|
||||
:special-members: push_to_hub
|
||||
:members:
|
||||
|
||||
|
||||
.. _from_pretrained-torch-dtype:
|
||||
|
||||
Model Instantiation dtype
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Under PyTorch a model normally gets instantiated with ``torch.float32`` format. This can be an issue if one tries to
|
||||
load a model whose weights are in fp16, since it'd require twice as much memory. To overcome this limitation, you can
|
||||
either explicitly pass the desired ``dtype`` using the ``torch_dtype`` argument:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
model = T5ForConditionalGeneration.from_pretrained("t5", torch_dtype=torch.float16)
|
||||
|
||||
or, if you want the model to always load in the optimal memory pattern, you can use the special value ``"auto"``,
|
||||
and then ``dtype`` will be automatically derived from the model's weights:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
model = T5ForConditionalGeneration.from_pretrained("t5", torch_dtype="auto")
|
||||
|
||||
Models instantiated from scratch can also be told which ``dtype`` to use with:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
config = T5Config.from_pretrained("t5")
|
||||
model = AutoModel.from_config(config)
|
||||
|
||||
Due to PyTorch's design, this functionality is only available for floating-point dtypes.
|
||||
|
||||
|
||||
|
||||
ModuleUtilsMixin
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_utils.ModuleUtilsMixin
|
||||
:members:
|
||||
|
||||
|
||||
TFPreTrainedModel
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.TFPreTrainedModel
|
||||
:special-members: push_to_hub
|
||||
:members:
|
||||
|
||||
|
||||
TFModelUtilsMixin
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_tf_utils.TFModelUtilsMixin
|
||||
:members:
|
||||
|
||||
|
||||
FlaxPreTrainedModel
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.FlaxPreTrainedModel
|
||||
:special-members: push_to_hub
|
||||
:members:
|
||||
|
||||
|
||||
Generation
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.generation_utils.GenerationMixin
|
||||
:members:
|
||||
|
||||
.. autoclass:: transformers.generation_tf_utils.TFGenerationMixin
|
||||
:members:
|
||||
|
||||
.. autoclass:: transformers.generation_flax_utils.FlaxGenerationMixin
|
||||
:members:
|
||||
|
||||
|
||||
Pushing to the Hub
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.file_utils.PushToHubMixin
|
||||
:members:
|
||||
71
docs/source/main_classes/optimizer_schedules.mdx
Normal file
@@ -0,0 +1,71 @@
|
||||
<!--Copyright 2020 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
-->
|
||||
|
||||
# Optimization
|
||||
|
||||
The `.optimization` module provides:
|
||||
|
||||
- an optimizer with weight decay fixed that can be used to fine-tune models,
|
||||
- several schedules in the form of schedule objects that inherit from `_LRSchedule`, and
|
||||
- a gradient accumulation class to accumulate the gradients of multiple batches.
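
As an illustration of what a learning rate schedule computes, the sketch below shows a multiplier that ramps up linearly during warmup and then decays linearly to zero. It is a plain-Python sketch of that general shape, not code taken from the library; the function name `linear_warmup_decay_factor`, the step counts and the base learning rate are all assumptions made for the example.

```python
def linear_warmup_decay_factor(step: int, num_warmup_steps: int, num_training_steps: int) -> float:
    """Multiplier applied to the base learning rate at a given step."""
    if step < num_warmup_steps:
        # Warmup phase: ramp linearly from 0 up to 1.
        return step / max(1, num_warmup_steps)
    # Decay phase: go linearly from 1 down to 0 at the last training step.
    remaining = num_training_steps - step
    return max(0.0, remaining / max(1, num_training_steps - num_warmup_steps))


base_lr = 5e-5  # hypothetical base learning rate
for step in (0, 5, 10, 55, 100):
    factor = linear_warmup_decay_factor(step, num_warmup_steps=10, num_training_steps=100)
    print(step, factor, base_lr * factor)
```

The optimizer's learning rate at each step is the base learning rate times this factor, so the schedule itself stays independent of the chosen base value.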
|
||||
|
||||
## AdamW (PyTorch)
|
||||
|
||||
[[autodoc]] AdamW
|
||||
|
||||
## AdaFactor (PyTorch)
|
||||
|
||||
[[autodoc]] Adafactor
|
||||
|
||||
## AdamWeightDecay (TensorFlow)
|
||||
|
||||
[[autodoc]] AdamWeightDecay
|
||||
|
||||
[[autodoc]] create_optimizer
|
||||
|
||||
## Schedules
|
||||
|
||||
### Learning Rate Schedules (PyTorch)
|
||||
|
||||
[[autodoc]] SchedulerType
|
||||
|
||||
[[autodoc]] get_scheduler
|
||||
|
||||
[[autodoc]] get_constant_schedule
|
||||
|
||||
[[autodoc]] get_constant_schedule_with_warmup
|
||||
|
||||
<img alt="" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/warmup_constant_schedule.png"/>
|
||||
|
||||
[[autodoc]] get_cosine_schedule_with_warmup
|
||||
|
||||
<img alt="" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/warmup_cosine_schedule.png"/>
|
||||
|
||||
[[autodoc]] get_cosine_with_hard_restarts_schedule_with_warmup
|
||||
|
||||
<img alt="" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/warmup_cosine_hard_restarts_schedule.png"/>
|
||||
|
||||
[[autodoc]] get_linear_schedule_with_warmup
|
||||
|
||||
<img alt="" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/warmup_linear_schedule.png"/>
|
||||
|
||||
[[autodoc]] get_polynomial_decay_schedule_with_warmup
|
||||
|
||||
### Warmup (TensorFlow)
|
||||
|
||||
[[autodoc]] WarmUp
|
||||
|
||||
## Gradient Strategies
|
||||
|
||||
### GradientAccumulator (TensorFlow)
|
||||
|
||||
[[autodoc]] GradientAccumulator
|
||||
@@ -1,97 +0,0 @@
|
||||
..
|
||||
Copyright 2020 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
|
||||
Optimization
|
||||
-----------------------------------------------------------------------------------------------------------------------
|
||||
|
||||
The ``.optimization`` module provides:
|
||||
|
||||
- an optimizer with weight decay fixed that can be used to fine-tune models,
|
||||
- several schedules in the form of schedule objects that inherit from ``_LRSchedule``, and
|
||||
- a gradient accumulation class to accumulate the gradients of multiple batches.
|
||||
|
||||
AdamW (PyTorch)
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.AdamW
|
||||
:members:
|
||||
|
||||
AdaFactor (PyTorch)
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.Adafactor
|
||||
|
||||
AdamWeightDecay (TensorFlow)
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.AdamWeightDecay
|
||||
|
||||
.. autofunction:: transformers.create_optimizer
|
||||
|
||||
Schedules
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Learning Rate Schedules (PyTorch)
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
.. autoclass:: transformers.SchedulerType
|
||||
|
||||
.. autofunction:: transformers.get_scheduler
|
||||
|
||||
.. autofunction:: transformers.get_constant_schedule
|
||||
|
||||
|
||||
.. autofunction:: transformers.get_constant_schedule_with_warmup
|
||||
|
||||
.. image:: https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/warmup_constant_schedule.png
|
||||
:target: https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/warmup_constant_schedule.png
|
||||
:alt:
|
||||
|
||||
|
||||
.. autofunction:: transformers.get_cosine_schedule_with_warmup
|
||||
|
||||
.. image:: https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/warmup_cosine_schedule.png
|
||||
:target: https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/warmup_cosine_schedule.png
|
||||
:alt:
|
||||
|
||||
|
||||
.. autofunction:: transformers.get_cosine_with_hard_restarts_schedule_with_warmup
|
||||
|
||||
.. image:: https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/warmup_cosine_hard_restarts_schedule.png
|
||||
:target: https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/warmup_cosine_hard_restarts_schedule.png
|
||||
:alt:
|
||||
|
||||
|
||||
|
||||
.. autofunction:: transformers.get_linear_schedule_with_warmup
|
||||
|
||||
.. image:: https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/warmup_linear_schedule.png
|
||||
:target: https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/warmup_linear_schedule.png
|
||||
:alt:
|
||||
|
||||
|
||||
.. autofunction:: transformers.get_polynomial_decay_schedule_with_warmup
|
||||
|
||||
|
||||
Warmup (TensorFlow)
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
.. autoclass:: transformers.WarmUp
|
||||
:members:
|
||||
|
||||
Gradient Strategies
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
GradientAccumulator (TensorFlow)
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
.. autoclass:: transformers.GradientAccumulator
|
||||
269
docs/source/main_classes/output.mdx
Normal file
@@ -0,0 +1,269 @@
|
||||
<!--Copyright 2020 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
-->
|
||||
|
||||
# Model outputs
|
||||
|
||||
All models have outputs that are instances of subclasses of [`~file_utils.ModelOutput`]. Those are
|
||||
data structures containing all the information returned by the model, but that can also be used as tuples or
|
||||
dictionaries.
|
||||
|
||||
Let's see how this looks in an example:
|
||||
|
||||
```python
|
||||
from transformers import BertTokenizer, BertForSequenceClassification
|
||||
import torch
|
||||
|
||||
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
|
||||
model = BertForSequenceClassification.from_pretrained('bert-base-uncased')
|
||||
|
||||
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
|
||||
labels = torch.tensor([1]).unsqueeze(0) # Batch size 1
|
||||
outputs = model(**inputs, labels=labels)
|
||||
```
|
||||
|
||||
The `outputs` object is a [`~modeling_outputs.SequenceClassifierOutput`]. As we can see in the
|
||||
documentation of that class below, it has an optional `loss`, a `logits`, an optional `hidden_states` and
|
||||
an optional `attentions` attribute. Here we have the `loss` since we passed along `labels`, but we don't have
|
||||
`hidden_states` and `attentions` because we didn't pass `output_hidden_states=True` or
|
||||
`output_attentions=True`.
|
||||
|
||||
You can access each attribute as you would usually do, and if that attribute has not been returned by the model, you
|
||||
will get `None`. Here for instance `outputs.loss` is the loss computed by the model, and `outputs.attentions` is
|
||||
`None`.
|
||||
|
||||
When considering our `outputs` object as a tuple, it only considers the attributes that don't have `None` values.
|
||||
Here for instance, it has two elements, `loss` then `logits`, so
|
||||
|
||||
```python
|
||||
outputs[:2]
|
||||
```
|
||||
|
||||
will return the tuple `(outputs.loss, outputs.logits)` for instance.
|
||||
|
||||
When considering our `outputs` object as a dictionary, it only considers the attributes that don't have `None`
|
||||
values. Here for instance, it has two keys that are `loss` and `logits`.
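
The attribute, tuple and dictionary views described above can be mimicked with a small standard-library sketch. `ToyOutput` below is a simplified, hypothetical stand-in written for this illustration; it is not the library's `ModelOutput` class and makes no claim about how that class is implemented. It only reproduces the behavior described in the text: unset attributes are `None`, and the tuple and dictionary views skip them.

```python
from dataclasses import dataclass, fields
from typing import Any, List, Optional, Tuple


@dataclass
class ToyOutput:
    """Simplified stand-in for a model output; not the library's class."""

    loss: Optional[float] = None
    logits: Optional[list] = None
    hidden_states: Optional[tuple] = None
    attentions: Optional[tuple] = None

    def keys(self) -> List[str]:
        # Dictionary view: only the names of attributes that are not None.
        return [f.name for f in fields(self) if getattr(self, f.name) is not None]

    def to_tuple(self) -> Tuple[Any, ...]:
        # Tuple view: the non-None attributes, in declaration order.
        return tuple(getattr(self, name) for name in self.keys())

    def __getitem__(self, key):
        # String keys behave like a dict; integers and slices like a tuple.
        if isinstance(key, str):
            if key not in self.keys():
                raise KeyError(key)
            return getattr(self, key)
        return self.to_tuple()[key]


# Only `loss` and `logits` are set, as in the example from the text.
outputs = ToyOutput(loss=0.7, logits=[[0.1, -0.2]])

print(outputs.attentions)  # attribute access on an unset field
print(outputs[:2])         # tuple-style access
print(outputs.keys())      # dictionary-style keys
print(outputs["logits"])   # dictionary-style lookup
```

Because the tuple view drops `None` entries, the position of a given attribute depends on which optional outputs were requested, so access by name is the more robust choice.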
|
||||
|
||||
We document here the generic model outputs that are used by more than one model type. Specific output types are
|
||||
documented on their corresponding model page.
|
||||
|
||||
## ModelOutput
|
||||
|
||||
[[autodoc]] file_utils.ModelOutput
|
||||
- to_tuple
|
||||
|
||||
## BaseModelOutput
|
||||
|
||||
[[autodoc]] modeling_outputs.BaseModelOutput
|
||||
|
||||
## BaseModelOutputWithPooling
|
||||
|
||||
[[autodoc]] modeling_outputs.BaseModelOutputWithPooling
|
||||
|
||||
## BaseModelOutputWithCrossAttentions
|
||||
|
||||
[[autodoc]] modeling_outputs.BaseModelOutputWithCrossAttentions
|
||||
|
||||
## BaseModelOutputWithPoolingAndCrossAttentions
|
||||
|
||||
[[autodoc]] modeling_outputs.BaseModelOutputWithPoolingAndCrossAttentions
|
||||
|
||||
## BaseModelOutputWithPast
|
||||
|
||||
[[autodoc]] modeling_outputs.BaseModelOutputWithPast
|
||||
|
||||
## BaseModelOutputWithPastAndCrossAttentions
|
||||
|
||||
[[autodoc]] modeling_outputs.BaseModelOutputWithPastAndCrossAttentions
|
||||
|
||||
## Seq2SeqModelOutput
|
||||
|
||||
[[autodoc]] modeling_outputs.Seq2SeqModelOutput
|
||||
|
||||
## CausalLMOutput
|
||||
|
||||
[[autodoc]] modeling_outputs.CausalLMOutput
|
||||
|
||||
## CausalLMOutputWithCrossAttentions
|
||||
|
||||
[[autodoc]] modeling_outputs.CausalLMOutputWithCrossAttentions
|
||||
|
||||
## CausalLMOutputWithPast
|
||||
|
||||
[[autodoc]] modeling_outputs.CausalLMOutputWithPast
|
||||
|
||||
## MaskedLMOutput
|
||||
|
||||
[[autodoc]] modeling_outputs.MaskedLMOutput
|
||||
|
||||
## Seq2SeqLMOutput
|
||||
|
||||
[[autodoc]] modeling_outputs.Seq2SeqLMOutput
|
||||
|
||||
## NextSentencePredictorOutput
|
||||
|
||||
[[autodoc]] modeling_outputs.NextSentencePredictorOutput
|
||||
|
||||
## SequenceClassifierOutput
|
||||
|
||||
[[autodoc]] modeling_outputs.SequenceClassifierOutput
|
||||
|
||||
## Seq2SeqSequenceClassifierOutput
|
||||
|
||||
[[autodoc]] modeling_outputs.Seq2SeqSequenceClassifierOutput
|
||||
|
||||
## MultipleChoiceModelOutput
|
||||
|
||||
[[autodoc]] modeling_outputs.MultipleChoiceModelOutput
|
||||
|
||||
## TokenClassifierOutput
|
||||
|
||||
[[autodoc]] modeling_outputs.TokenClassifierOutput
|
||||
|
||||
## QuestionAnsweringModelOutput
|
||||
|
||||
[[autodoc]] modeling_outputs.QuestionAnsweringModelOutput
|
||||
|
||||
## Seq2SeqQuestionAnsweringModelOutput
|
||||
|
||||
[[autodoc]] modeling_outputs.Seq2SeqQuestionAnsweringModelOutput
|
||||
|
||||
## TFBaseModelOutput
|
||||
|
||||
[[autodoc]] modeling_tf_outputs.TFBaseModelOutput
|
||||
|
||||
## TFBaseModelOutputWithPooling
|
||||
|
||||
[[autodoc]] modeling_tf_outputs.TFBaseModelOutputWithPooling
|
||||
|
||||
## TFBaseModelOutputWithPoolingAndCrossAttentions
|
||||
|
||||
[[autodoc]] modeling_tf_outputs.TFBaseModelOutputWithPoolingAndCrossAttentions
|
||||
|
||||
## TFBaseModelOutputWithPast
|
||||
|
||||
[[autodoc]] modeling_tf_outputs.TFBaseModelOutputWithPast
|
||||
|
||||
## TFBaseModelOutputWithPastAndCrossAttentions
|
||||
|
||||
[[autodoc]] modeling_tf_outputs.TFBaseModelOutputWithPastAndCrossAttentions
|
||||
|
||||
## TFSeq2SeqModelOutput
|
||||
|
||||
[[autodoc]] modeling_tf_outputs.TFSeq2SeqModelOutput
|
||||
|
||||
## TFCausalLMOutput
|
||||
|
||||
[[autodoc]] modeling_tf_outputs.TFCausalLMOutput
|
||||
|
||||
## TFCausalLMOutputWithCrossAttentions
|
||||
|
||||
[[autodoc]] modeling_tf_outputs.TFCausalLMOutputWithCrossAttentions
|
||||
|
||||
## TFCausalLMOutputWithPast
|
||||
|
||||
[[autodoc]] modeling_tf_outputs.TFCausalLMOutputWithPast
|
||||
|
||||
## TFMaskedLMOutput
|
||||
|
||||
[[autodoc]] modeling_tf_outputs.TFMaskedLMOutput
|
||||
|
||||
## TFSeq2SeqLMOutput
|
||||
|
||||
[[autodoc]] modeling_tf_outputs.TFSeq2SeqLMOutput
|
||||
|
||||
## TFNextSentencePredictorOutput
|
||||
|
||||
[[autodoc]] modeling_tf_outputs.TFNextSentencePredictorOutput
|
||||
|
||||
## TFSequenceClassifierOutput
|
||||
|
||||
[[autodoc]] modeling_tf_outputs.TFSequenceClassifierOutput
|
||||
|
||||
## TFSeq2SeqSequenceClassifierOutput
|
||||
|
||||
[[autodoc]] modeling_tf_outputs.TFSeq2SeqSequenceClassifierOutput
|
||||
|
||||
## TFMultipleChoiceModelOutput
|
||||
|
||||
[[autodoc]] modeling_tf_outputs.TFMultipleChoiceModelOutput
|
||||
|
||||
## TFTokenClassifierOutput
|
||||
|
||||
[[autodoc]] modeling_tf_outputs.TFTokenClassifierOutput
|
||||
|
||||
## TFQuestionAnsweringModelOutput
|
||||
|
||||
[[autodoc]] modeling_tf_outputs.TFQuestionAnsweringModelOutput
|
||||
|
||||
## TFSeq2SeqQuestionAnsweringModelOutput
|
||||
|
||||
[[autodoc]] modeling_tf_outputs.TFSeq2SeqQuestionAnsweringModelOutput
|
||||
|
||||
## FlaxBaseModelOutput
|
||||
|
||||
[[autodoc]] modeling_flax_outputs.FlaxBaseModelOutput
|
||||
|
||||
## FlaxBaseModelOutputWithPast
|
||||
|
||||
[[autodoc]] modeling_flax_outputs.FlaxBaseModelOutputWithPast
|
||||
|
||||
## FlaxBaseModelOutputWithPooling
|
||||
|
||||
[[autodoc]] modeling_flax_outputs.FlaxBaseModelOutputWithPooling
|
||||
|
||||
## FlaxBaseModelOutputWithPastAndCrossAttentions
|
||||
|
||||
[[autodoc]] modeling_flax_outputs.FlaxBaseModelOutputWithPastAndCrossAttentions
|
||||
|
||||
## FlaxSeq2SeqModelOutput
|
||||
|
||||
[[autodoc]] modeling_flax_outputs.FlaxSeq2SeqModelOutput
|
||||
|
||||
## FlaxCausalLMOutputWithCrossAttentions
|
||||
|
||||
[[autodoc]] modeling_flax_outputs.FlaxCausalLMOutputWithCrossAttentions
|
||||
|
||||
## FlaxMaskedLMOutput
|
||||
|
||||
[[autodoc]] modeling_flax_outputs.FlaxMaskedLMOutput
|
||||
|
||||
## FlaxSeq2SeqLMOutput
|
||||
|
||||
[[autodoc]] modeling_flax_outputs.FlaxSeq2SeqLMOutput
|
||||
|
||||
## FlaxNextSentencePredictorOutput
|
||||
|
||||
[[autodoc]] modeling_flax_outputs.FlaxNextSentencePredictorOutput
|
||||
|
||||
## FlaxSequenceClassifierOutput
|
||||
|
||||
[[autodoc]] modeling_flax_outputs.FlaxSequenceClassifierOutput
|
||||
|
||||
## FlaxSeq2SeqSequenceClassifierOutput
|
||||
|
||||
[[autodoc]] modeling_flax_outputs.FlaxSeq2SeqSequenceClassifierOutput
|
||||
|
||||
## FlaxMultipleChoiceModelOutput
|
||||
|
||||
[[autodoc]] modeling_flax_outputs.FlaxMultipleChoiceModelOutput
|
||||
|
||||
## FlaxTokenClassifierOutput
|
||||
|
||||
[[autodoc]] modeling_flax_outputs.FlaxTokenClassifierOutput
|
||||
|
||||
## FlaxQuestionAnsweringModelOutput
|
||||
|
||||
[[autodoc]] modeling_flax_outputs.FlaxQuestionAnsweringModelOutput
|
||||
|
||||
## FlaxSeq2SeqQuestionAnsweringModelOutput
|
||||
|
||||
[[autodoc]] modeling_flax_outputs.FlaxSeq2SeqQuestionAnsweringModelOutput
|
||||
@@ -1,412 +0,0 @@
|
||||
..
|
||||
Copyright 2020 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
|
||||
Model outputs
|
||||
-----------------------------------------------------------------------------------------------------------------------
|
||||
|
||||
All models have outputs that are instances of subclasses of :class:`~transformers.file_utils.ModelOutput`. Those are
|
||||
data structures containing all the information returned by the model, but that can also be used as tuples or
|
||||
dictionaries.
|
||||
|
||||
Let's see how this looks in an example:
|
||||
|
||||
.. code-block::
|
||||
|
||||
from transformers import BertTokenizer, BertForSequenceClassification
|
||||
import torch
|
||||
|
||||
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
|
||||
model = BertForSequenceClassification.from_pretrained('bert-base-uncased')
|
||||
|
||||
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
|
||||
labels = torch.tensor([1]).unsqueeze(0) # Batch size 1
|
||||
outputs = model(**inputs, labels=labels)
|
||||
|
||||
The ``outputs`` object is a :class:`~transformers.modeling_outputs.SequenceClassifierOutput`. As we can see in the
|
||||
documentation of that class below, it has an optional ``loss``, a ``logits``, an optional ``hidden_states`` and
|
||||
an optional ``attentions`` attribute. Here we have the ``loss`` since we passed along ``labels``, but we don't have
|
||||
``hidden_states`` and ``attentions`` because we didn't pass ``output_hidden_states=True`` or
|
||||
``output_attentions=True``.
|
||||
|
||||
You can access each attribute as you would usually do, and if that attribute has not been returned by the model, you
|
||||
will get ``None``. Here for instance ``outputs.loss`` is the loss computed by the model, and ``outputs.attentions`` is
|
||||
``None``.
|
||||
|
||||
When considering our ``outputs`` object as a tuple, it only considers the attributes that don't have ``None`` values.
|
||||
Here for instance, it has two elements, ``loss`` then ``logits``, so
|
||||
|
||||
.. code-block::
|
||||
|
||||
outputs[:2]
|
||||
|
||||
will return the tuple ``(outputs.loss, outputs.logits)`` for instance.
|
||||
|
||||
When considering our ``outputs`` object as a dictionary, it only considers the attributes that don't have ``None``
|
||||
values. Here for instance, it has two keys that are ``loss`` and ``logits``.
|
||||
|
||||
We document here the generic model outputs that are used by more than one model type. Specific output types are
|
||||
documented on their corresponding model page.
|
||||
|
||||
ModelOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.file_utils.ModelOutput
|
||||
:members: to_tuple
|
||||
|
||||
|
||||
BaseModelOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_outputs.BaseModelOutput
|
||||
:members:
|
||||
|
||||
|
||||
BaseModelOutputWithPooling
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_outputs.BaseModelOutputWithPooling
|
||||
:members:
|
||||
|
||||
|
||||
BaseModelOutputWithCrossAttentions
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_outputs.BaseModelOutputWithCrossAttentions
|
||||
:members:
|
||||
|
||||
|
||||
BaseModelOutputWithPoolingAndCrossAttentions
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_outputs.BaseModelOutputWithPoolingAndCrossAttentions
|
||||
:members:
|
||||
|
||||
|
||||
BaseModelOutputWithPast
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_outputs.BaseModelOutputWithPast
|
||||
:members:
|
||||
|
||||
|
||||
BaseModelOutputWithPastAndCrossAttentions
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_outputs.BaseModelOutputWithPastAndCrossAttentions
|
||||
:members:
|
||||
|
||||
|
||||
Seq2SeqModelOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_outputs.Seq2SeqModelOutput
|
||||
:members:
|
||||
|
||||
|
||||
CausalLMOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_outputs.CausalLMOutput
|
||||
:members:
|
||||
|
||||
|
||||
CausalLMOutputWithCrossAttentions
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_outputs.CausalLMOutputWithCrossAttentions
|
||||
:members:
|
||||
|
||||
|
||||
CausalLMOutputWithPast
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_outputs.CausalLMOutputWithPast
|
||||
:members:
|
||||
|
||||
|
||||
MaskedLMOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_outputs.MaskedLMOutput
|
||||
:members:
|
||||
|
||||
|
||||
Seq2SeqLMOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_outputs.Seq2SeqLMOutput
|
||||
:members:
|
||||
|
||||
|
||||
NextSentencePredictorOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_outputs.NextSentencePredictorOutput
|
||||
:members:
|
||||
|
||||
|
||||
SequenceClassifierOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_outputs.SequenceClassifierOutput
|
||||
:members:
|
||||
|
||||
|
||||
Seq2SeqSequenceClassifierOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_outputs.Seq2SeqSequenceClassifierOutput
|
||||
:members:
|
||||
|
||||
|
||||
MultipleChoiceModelOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_outputs.MultipleChoiceModelOutput
|
||||
:members:
|
||||
|
||||
|
||||
TokenClassifierOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_outputs.TokenClassifierOutput
|
||||
:members:
|
||||
|
||||
|
||||
QuestionAnsweringModelOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_outputs.QuestionAnsweringModelOutput
|
||||
:members:
|
||||
|
||||
|
||||
Seq2SeqQuestionAnsweringModelOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_outputs.Seq2SeqQuestionAnsweringModelOutput
|
||||
:members:
|
||||
|
||||
|
||||
TFBaseModelOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_tf_outputs.TFBaseModelOutput
|
||||
:members:
|
||||
|
||||
|
||||
TFBaseModelOutputWithPooling
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_tf_outputs.TFBaseModelOutputWithPooling
|
||||
:members:
|
||||
|
||||
|
||||
TFBaseModelOutputWithPoolingAndCrossAttentions
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_tf_outputs.TFBaseModelOutputWithPoolingAndCrossAttentions
|
||||
:members:
|
||||
|
||||
|
||||
TFBaseModelOutputWithPast
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_tf_outputs.TFBaseModelOutputWithPast
|
||||
:members:
|
||||
|
||||
|
||||
TFBaseModelOutputWithPastAndCrossAttentions
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_tf_outputs.TFBaseModelOutputWithPastAndCrossAttentions
|
||||
:members:
|
||||
|
||||
|
||||
TFSeq2SeqModelOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_tf_outputs.TFSeq2SeqModelOutput
|
||||
:members:
|
||||
|
||||
|
||||
TFCausalLMOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_tf_outputs.TFCausalLMOutput
|
||||
:members:
|
||||
|
||||
|
||||
TFCausalLMOutputWithCrossAttentions
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_tf_outputs.TFCausalLMOutputWithCrossAttentions
|
||||
:members:
|
||||
|
||||
|
||||
TFCausalLMOutputWithPast
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_tf_outputs.TFCausalLMOutputWithPast
|
||||
:members:
|
||||
|
||||
|
||||
TFMaskedLMOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_tf_outputs.TFMaskedLMOutput
|
||||
:members:
|
||||
|
||||
|
||||
TFSeq2SeqLMOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_tf_outputs.TFSeq2SeqLMOutput
|
||||
:members:
|
||||
|
||||
|
||||
TFNextSentencePredictorOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_tf_outputs.TFNextSentencePredictorOutput
|
||||
:members:
|
||||
|
||||
|
||||
TFSequenceClassifierOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_tf_outputs.TFSequenceClassifierOutput
|
||||
:members:
|
||||
|
||||
|
||||
TFSeq2SeqSequenceClassifierOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_tf_outputs.TFSeq2SeqSequenceClassifierOutput
|
||||
:members:
|
||||
|
||||
|
||||
TFMultipleChoiceModelOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_tf_outputs.TFMultipleChoiceModelOutput
|
||||
:members:
|
||||
|
||||
|
||||
TFTokenClassifierOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_tf_outputs.TFTokenClassifierOutput
|
||||
:members:
|
||||
|
||||
|
||||
TFQuestionAnsweringModelOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_tf_outputs.TFQuestionAnsweringModelOutput
|
||||
:members:
|
||||
|
||||
|
||||
TFSeq2SeqQuestionAnsweringModelOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_tf_outputs.TFSeq2SeqQuestionAnsweringModelOutput
|
||||
:members:
|
||||
|
||||
|
||||
FlaxBaseModelOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_flax_outputs.FlaxBaseModelOutput
|
||||
|
||||
|
||||
FlaxBaseModelOutputWithPast
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_flax_outputs.FlaxBaseModelOutputWithPast
|
||||
|
||||
|
||||
FlaxBaseModelOutputWithPooling
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_flax_outputs.FlaxBaseModelOutputWithPooling
|
||||
|
||||
|
||||
FlaxBaseModelOutputWithPastAndCrossAttentions
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_flax_outputs.FlaxBaseModelOutputWithPastAndCrossAttentions
|
||||
|
||||
|
||||
FlaxSeq2SeqModelOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_flax_outputs.FlaxSeq2SeqModelOutput
|
||||
|
||||
|
||||
FlaxCausalLMOutputWithCrossAttentions
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_flax_outputs.FlaxCausalLMOutputWithCrossAttentions
|
||||
|
||||
|
||||
FlaxMaskedLMOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_flax_outputs.FlaxMaskedLMOutput
|
||||
|
||||
|
||||
FlaxSeq2SeqLMOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_flax_outputs.FlaxSeq2SeqLMOutput
|
||||
|
||||
|
||||
FlaxNextSentencePredictorOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_flax_outputs.FlaxNextSentencePredictorOutput
|
||||
|
||||
|
||||
FlaxSequenceClassifierOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_flax_outputs.FlaxSequenceClassifierOutput
|
||||
|
||||
|
||||
FlaxSeq2SeqSequenceClassifierOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_flax_outputs.FlaxSeq2SeqSequenceClassifierOutput
|
||||
|
||||
|
||||
FlaxMultipleChoiceModelOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_flax_outputs.FlaxMultipleChoiceModelOutput
|
||||
|
||||
|
||||
FlaxTokenClassifierOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_flax_outputs.FlaxTokenClassifierOutput
|
||||
|
||||
|
||||
FlaxQuestionAnsweringModelOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_flax_outputs.FlaxQuestionAnsweringModelOutput
|
||||
|
||||
|
||||
FlaxSeq2SeqQuestionAnsweringModelOutput
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.modeling_flax_outputs.FlaxSeq2SeqQuestionAnsweringModelOutput
|
||||
375
docs/source/main_classes/pipelines.mdx
Normal file
@@ -0,0 +1,375 @@
|
||||
<!--Copyright 2020 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
-->
|
||||
|
||||
# Pipelines
|
||||
|
||||
The pipelines are a great and easy way to use models for inference. These pipelines are objects that abstract most of
|
||||
the complex code from the library, offering a simple API dedicated to several tasks, including Named Entity
|
||||
Recognition, Masked Language Modeling, Sentiment Analysis, Feature Extraction and Question Answering. See the
|
||||
[task summary](../task_summary) for examples of use.
|
||||
|
||||
There are two categories of pipeline abstractions to be aware of:
|
||||
|
||||
- The [`pipeline`], which is the most powerful object and encapsulates all other pipelines.
|
||||
- The other task-specific pipelines:
|
||||
|
||||
- [`AudioClassificationPipeline`]
|
||||
- [`AutomaticSpeechRecognitionPipeline`]
|
||||
- [`ConversationalPipeline`]
|
||||
- [`FeatureExtractionPipeline`]
|
||||
- [`FillMaskPipeline`]
|
||||
- [`ImageClassificationPipeline`]
|
||||
- [`ImageSegmentationPipeline`]
|
||||
- [`ObjectDetectionPipeline`]
|
||||
- [`QuestionAnsweringPipeline`]
|
||||
- [`SummarizationPipeline`]
|
||||
- [`TableQuestionAnsweringPipeline`]
|
||||
- [`TextClassificationPipeline`]
|
||||
- [`TextGenerationPipeline`]
|
||||
- [`Text2TextGenerationPipeline`]
|
||||
- [`TokenClassificationPipeline`]
|
||||
- [`TranslationPipeline`]
|
||||
- [`ZeroShotClassificationPipeline`]
|
||||
|
||||
## The pipeline abstraction
|
||||
|
||||
The *pipeline* abstraction is a wrapper around all the other available pipelines. It is instantiated like any other
|
||||
pipeline but can provide additional quality-of-life features.
|
||||
|
||||
Simple call on one item:
|
||||
|
||||
```python
|
||||
>>> pipe = pipeline("text-classification")
|
||||
>>> pipe("This restaurant is awesome")
|
||||
[{'label': 'POSITIVE', 'score': 0.9998743534088135}]
|
||||
```
|
||||
|
||||
If you want to use a specific model from the [hub](https://huggingface.co) you can omit the task if the model on
|
||||
the hub already defines it:
|
||||
|
||||
```python
|
||||
>>> pipe = pipeline(model="roberta-large-mnli")
|
||||
>>> pipe("This restaurant is awesome")
|
||||
[{'label': 'POSITIVE', 'score': 0.9998743534088135}]
|
||||
```
|
||||
|
||||
To call a pipeline on many items, you can call it with a *list*:
|
||||
|
||||
```python
|
||||
>>> pipe = pipeline("text-classification")
|
||||
>>> pipe(["This restaurant is awesome", "This restaurant is awful"])
|
||||
[{'label': 'POSITIVE', 'score': 0.9998743534088135},
|
||||
{'label': 'NEGATIVE', 'score': 0.9996669292449951}]
|
||||
```
|
||||
|
||||
To iterate over full datasets, it is recommended to use a `dataset` directly. This means you don't need to allocate
|
||||
the whole dataset at once, nor do you need to do batching yourself. This should work just as fast as custom loops on
|
||||
GPU. If it doesn't, don't hesitate to create an issue.
|
||||
|
||||
```python
|
||||
import datasets
|
||||
from transformers import pipeline
|
||||
from transformers.pipelines.base import KeyDataset
|
||||
import tqdm
|
||||
|
||||
pipe = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h", device=0)
|
||||
dataset = datasets.load_dataset("superb", name="asr", split="test")
|
||||
|
||||
# KeyDataset (PyTorch only) simply returns the value stored under the given key in each dataset item,
|
||||
# as we're not interested in the *target* part of the dataset.
|
||||
for out in tqdm.tqdm(pipe(KeyDataset(dataset, "file"))):
|
||||
print(out)
|
||||
# {"text": "NUMBER TEN FRESH NELLY IS WAITING ON YOU GOOD NIGHT HUSBAND"}
|
||||
# {"text": ....}
|
||||
# ....
|
||||
```
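To make the role of `KeyDataset` concrete, here is a minimal sketch of the idea in plain Python. `KeyView` and the sample rows are hypothetical names invented for this illustration; this is not the library's implementation, only the behavior described in the comment above (return one field of each dataset item).

```python
class KeyView:
    """Toy stand-in for KeyDataset: exposes a single field of each dataset row."""

    def __init__(self, dataset, key):
        self.dataset = dataset
        self.key = key

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, i):
        # Return only the requested field and drop the rest of the row.
        return self.dataset[i][self.key]


# Hypothetical rows shaped like an ASR dataset: an input file plus a target text.
rows = [
    {"file": "a.flac", "text": "HELLO"},
    {"file": "b.flac", "text": "WORLD"},
]
files = KeyView(rows, "file")
print(len(files), files[0], files[1])  # 2 a.flac b.flac
```

The pipeline then only ever sees the inputs, while the targets stay in the underlying dataset.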
|
||||
|
||||
[[autodoc]] pipeline
|
||||
|
||||
## Pipeline batching
|
||||
|
||||
All pipelines (except *zero-shot-classification* and *question-answering* currently) can use batching. This will work
|
||||
whenever the pipeline uses its streaming ability (that is, when you pass a list or a `Dataset`).
|
||||
|
||||
```python
|
||||
from transformers import pipeline
|
||||
from transformers.pipelines.base import KeyDataset
|
||||
import datasets
|
||||
import tqdm
|
||||
|
||||
dataset = datasets.load_dataset("imdb", name="plain_text", split="unsupervised")
|
||||
pipe = pipeline("text-classification", device=0)
|
||||
for out in pipe(KeyDataset(dataset, "text"), batch_size=8, truncation="only_first"):
|
||||
print(out)
|
||||
# [{'label': 'POSITIVE', 'score': 0.9998743534088135}]
|
||||
# Exactly the same output as before, but the contents are passed
|
||||
# as batches to the model
|
||||
```
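Conceptually, streaming with a `batch_size` groups incoming items before each model call and then hands the results back one at a time. The following sketch illustrates that idea only; `iter_batches`, `stream` and `fake_model` are hypothetical helpers written for this example and say nothing about how the library implements it.

```python
from itertools import islice


def iter_batches(items, batch_size):
    """Yield lists of at most `batch_size` items, consuming `items` lazily."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def stream(items, batch_size, run_model):
    """Feed batches to `run_model`, but yield results one item at a time."""
    for batch in iter_batches(items, batch_size):
        yield from run_model(batch)


calls = []


def fake_model(batch):
    # Hypothetical stand-in for a forward pass: records the batch size it received.
    calls.append(len(batch))
    return [{"length": len(text)} for text in batch]


texts = (f"review {i}" for i in range(10))  # a generator, never fully materialized
results = list(stream(texts, batch_size=4, run_model=fake_model))
print(calls)         # [4, 4, 2]
print(len(results))  # 10
```

The caller still receives one output per input, which is why the loop over the pipeline looks the same with or without batching.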
|
||||
|
||||
<Tip warning={true}>
|
||||
|
||||
However, this is not automatically a win for performance. It can be either a 10x speedup or 5x slowdown depending
|
||||
on hardware, data and the actual model being used.
|
||||
|
||||
Example where it's mostly a speedup:
|
||||
|
||||
</Tip>
|
||||
|
||||
```python
|
||||
from transformers import pipeline
|
||||
from torch.utils.data import Dataset
|
||||
import tqdm
|
||||
|
||||
|
||||
pipe = pipeline("text-classification", device=0)
|
||||
|
||||
|
||||
class MyDataset(Dataset):
|
||||
def __len__(self):
|
||||
return 5000
|
||||
|
||||
def __getitem__(self, i):
|
||||
return "This is a test"
|
||||
|
||||
|
||||
dataset = MyDataset()
|
||||
|
||||
for batch_size in [1, 8, 64, 256]:
|
||||
print("-" * 30)
|
||||
print(f"Streaming batch_size={batch_size}")
|
||||
for out in tqdm.tqdm(pipe(dataset, batch_size=batch_size), total=len(dataset)):
|
||||
pass
|
||||
```
|
||||
|
||||
```
|
||||
# On GTX 970
|
||||
------------------------------
|
||||
Streaming no batching
|
||||
100%|██████████████████████████████████████████████████████████████████████| 5000/5000 [00:26<00:00, 187.52it/s]
|
||||
------------------------------
|
||||
Streaming batch_size=8
|
||||
100%|█████████████████████████████████████████████████████████████████████| 5000/5000 [00:04<00:00, 1205.95it/s]
|
||||
------------------------------
|
||||
Streaming batch_size=64
|
||||
100%|█████████████████████████████████████████████████████████████████████| 5000/5000 [00:02<00:00, 2478.24it/s]
|
||||
------------------------------
|
||||
Streaming batch_size=256
|
||||
100%|█████████████████████████████████████████████████████████████████████| 5000/5000 [00:01<00:00, 2554.43it/s]
|
||||
(diminishing returns, saturated the GPU)
|
||||
```
|
||||
|
||||
Example where it's mostly a slowdown:
|
||||
|
||||
```python
|
||||
class MyDataset(Dataset):
|
||||
def __len__(self):
|
||||
return 5000
|
||||
|
||||
def __getitem__(self, i):
|
||||
if i % 64 == 0:
|
||||
n = 100
|
||||
else:
|
||||
n = 1
|
||||
return "This is a test" * n
|
||||
```
|
||||
|
||||
The dataset now returns an occasional very long sentence compared to the others. In that case, the **whole** batch will need to be 400
|
||||
tokens long, so the whole batch will be [64, 400] instead of [64, 4], leading to a large slowdown. Even worse, on
|
||||
bigger batches, the program simply crashes.
|
||||
|
||||
|
||||
```
|
||||
------------------------------
|
||||
Streaming no batching
|
||||
100%|█████████████████████████████████████████████████████████████████████| 1000/1000 [00:05<00:00, 183.69it/s]
|
||||
------------------------------
|
||||
Streaming batch_size=8
|
||||
100%|█████████████████████████████████████████████████████████████████████| 1000/1000 [00:03<00:00, 265.74it/s]
|
||||
------------------------------
|
||||
Streaming batch_size=64
|
||||
100%|██████████████████████████████████████████████████████████████████████| 1000/1000 [00:26<00:00, 37.80it/s]
|
||||
------------------------------
|
||||
Streaming batch_size=256
|
||||
0%| | 0/1000 [00:00<?, ?it/s]
|
||||
Traceback (most recent call last):
|
||||
File "/home/nicolas/src/transformers/test.py", line 42, in <module>
|
||||
for out in tqdm.tqdm(pipe(dataset, batch_size=256), total=len(dataset)):
|
||||
....
|
||||
q = q / math.sqrt(dim_per_head) # (bs, n_heads, q_length, dim_per_head)
|
||||
RuntimeError: CUDA out of memory. Tried to allocate 376.00 MiB (GPU 0; 3.95 GiB total capacity; 1.72 GiB already allocated; 354.88 MiB free; 2.46 GiB reserved in total by PyTorch)
|
||||
```
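The cost of padding in the example above can be estimated with simple arithmetic. The sketch below assumes, as the text does, that a short item is 4 tokens and a long one is 400, and that every batch is padded to its longest item; `padded_tokens` is a hypothetical helper written for this illustration.

```python
def padded_tokens(lengths, batch_size):
    """Total token positions processed when each batch is padded to its longest item."""
    total = 0
    for start in range(0, len(lengths), batch_size):
        batch = lengths[start:start + batch_size]
        total += len(batch) * max(batch)
    return total


# Lengths mirroring the dataset above: every 64th item is 400 tokens, the rest are 4.
lengths = [400 if i % 64 == 0 else 4 for i in range(5000)]

for batch_size in [1, 8, 64]:
    print(batch_size, padded_tokens(lengths, batch_size))
# 1 51284
# 8 270272
# 64 2000000
```

With `batch_size=64`, every batch contains one long item, so the model processes roughly 39 times as many token positions as it does without batching. Whether that translates into an equivalent slowdown depends on the hardware, which is why measuring remains necessary.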
|
||||
|
||||
There are no good (general) solutions for this problem, and your mileage may vary depending on your use cases.

For users, a rule of thumb is:
|
||||
|
||||
- **Measure performance on your load, with your hardware. Measure, measure, and keep measuring. Real numbers are the
|
||||
only way to go.**
|
||||
- If you are latency constrained (live product doing inference), don't batch.
|
||||
- If you are using CPU, don't batch.
|
||||
- If you are optimizing for throughput (you want to run your model on a large amount of static data) on GPU, then:
|
||||
|
||||
- If you have no clue about the size of the sequence_length ("natural" data), by default don't batch, measure and
|
||||
tentatively try adding it, and add OOM checks to recover when it fails (and it will at some point if you don't
|
||||
control the sequence_length.)
|
||||
- If your sequence_length is super regular, then batching is more likely to be VERY interesting, measure and push
|
||||
it until you get OOMs.
|
||||
- The larger the GPU, the more likely batching is to pay off.
|
||||
- As soon as you enable batching, make sure you can handle OOMs nicely.
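One way to handle OOMs gracefully, as the last point suggests, is to retry with a smaller batch when a batch fails. The sketch below is a plain-Python illustration under stated assumptions: `run_with_backoff` and `fake_run_batch` are hypothetical names, and `MemoryError` stands in for whatever exception your framework raises (the traceback above shows a `RuntimeError` for CUDA out-of-memory).

```python
def run_with_backoff(items, batch_size, run_batch, oom_error=MemoryError):
    """Process `items` in batches, halving the batch size whenever `run_batch` raises `oom_error`."""
    results = []
    start = 0
    while start < len(items):
        batch = items[start:start + batch_size]
        try:
            results.extend(run_batch(batch))
        except oom_error:
            if batch_size == 1:
                raise  # even a single item does not fit; nothing left to shrink
            batch_size = max(1, batch_size // 2)
            continue  # retry from the same position with a smaller batch
        start += len(batch)
    return results, batch_size


def fake_run_batch(batch):
    # Hypothetical model call that "runs out of memory" above 16 items.
    if len(batch) > 16:
        raise MemoryError("simulated OOM")
    return [len(text) for text in batch]


texts = ["This is a test"] * 100
outputs, final_batch_size = run_with_backoff(texts, batch_size=64, run_batch=fake_run_batch)
print(len(outputs), final_batch_size)  # 100 16
```

This version never grows the batch size back, a deliberately simple choice; on a real GPU you may also need to free cached memory before retrying.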
|
||||
|
||||
## Pipeline custom code
|
||||
|
||||
This section covers what to do if you want to override a specific pipeline.
|
||||
|
||||
Don't hesitate to create an issue for your task at hand: the goal of the pipeline is to be easy to use and support most
|
||||
cases, so `transformers` could maybe support your use case.
|
||||
|
||||
|
||||
If you simply want to try it yourself, you can:
|
||||
|
||||
- Subclass your pipeline of choice
|
||||
|
||||
```python
|
||||
class MyPipeline(TextClassificationPipeline):
|
||||
def postprocess(...):
|
||||
...
|
||||
scores = scores * 100
|
||||
...
|
||||
|
||||
my_pipeline = MyPipeline(model=model, tokenizer=tokenizer, ...)
|
||||
# or, if you use the *pipeline* function:
|
||||
my_pipeline = pipeline(model="xxxx", pipeline_class=MyPipeline)
|
||||
```
|
||||
|
||||
That should enable you to do all the custom code you want.
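The override pattern itself can be tried without loading any model. The sketch below uses `ToyPipeline`, a hypothetical stand-in with a preprocess, forward and postprocess step; it is not a `transformers` class and its "model" is a fake scoring rule, so it only illustrates how a subclass can reuse the parent's `postprocess` and adjust the result.

```python
class ToyPipeline:
    """Hypothetical stand-in for a pipeline: preprocess -> forward -> postprocess."""

    def preprocess(self, text):
        return text.lower().split()

    def forward(self, tokens):
        # Fake "model": the score is the share of tokens equal to "awesome".
        return {"score": tokens.count("awesome") / max(len(tokens), 1)}

    def postprocess(self, model_output):
        return {"label": "POSITIVE", "score": model_output["score"]}

    def __call__(self, text):
        return self.postprocess(self.forward(self.preprocess(text)))


class MyToyPipeline(ToyPipeline):
    def postprocess(self, model_output):
        # Reuse the parent's logic, then rescale the score to a percentage.
        result = super().postprocess(model_output)
        result["score"] = result["score"] * 100
        return result


print(ToyPipeline()("This restaurant is awesome"))    # {'label': 'POSITIVE', 'score': 0.25}
print(MyToyPipeline()("This restaurant is awesome"))  # {'label': 'POSITIVE', 'score': 25.0}
```

Calling `super().postprocess(...)` keeps the subclass small and means that changes to the parent's output format are picked up automatically.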
|
||||
|
||||
|
||||
## Implementing a pipeline
|
||||
|
||||
[Implementing a new pipeline](../add_new_pipeline)
|
||||
|
||||
## The task specific pipelines
|
||||
|
||||
|
||||
### AudioClassificationPipeline
|
||||
|
||||
[[autodoc]] AudioClassificationPipeline
|
||||
- __call__
|
||||
- all
|
||||
|
||||
### AutomaticSpeechRecognitionPipeline
|
||||
|
||||
[[autodoc]] AutomaticSpeechRecognitionPipeline
|
||||
- __call__
|
||||
- all
|
||||
|
||||
### ConversationalPipeline
|
||||
|
||||
[[autodoc]] Conversation
|
||||
|
||||
[[autodoc]] ConversationalPipeline
|
||||
- __call__
|
||||
- all
|
||||
|
||||
### FeatureExtractionPipeline
|
||||
|
||||
[[autodoc]] FeatureExtractionPipeline
|
||||
- __call__
|
||||
- all
|
||||
|
||||
### FillMaskPipeline
|
||||
|
||||
[[autodoc]] FillMaskPipeline
|
||||
- __call__
|
||||
- all
|
||||
|
||||
### ImageClassificationPipeline
|
||||
|
||||
[[autodoc]] ImageClassificationPipeline
|
||||
- __call__
|
||||
- all
|
||||
|
||||
### ImageSegmentationPipeline
|
||||
|
||||
[[autodoc]] ImageSegmentationPipeline
|
||||
- __call__
|
||||
- all
|
||||
|
||||
### NerPipeline
|
||||
|
||||
[[autodoc]] NerPipeline
|
||||
|
||||
See [`TokenClassificationPipeline`] for all details.
|
||||
|
||||
### ObjectDetectionPipeline
|
||||
|
||||
[[autodoc]] ObjectDetectionPipeline
|
||||
- __call__
|
||||
- all
|
||||
|
||||
### QuestionAnsweringPipeline
|
||||
|
||||
[[autodoc]] QuestionAnsweringPipeline
|
||||
- __call__
|
||||
- all
|
||||
|
||||
### SummarizationPipeline
|
||||
|
||||
[[autodoc]] SummarizationPipeline
|
||||
- __call__
|
||||
- all
|
||||
|
||||
### TableQuestionAnsweringPipeline
|
||||
|
||||
[[autodoc]] TableQuestionAnsweringPipeline
|
||||
- __call__
|
||||
|
||||
### TextClassificationPipeline
|
||||
|
||||
[[autodoc]] TextClassificationPipeline
|
||||
- __call__
|
||||
- all
|
||||
|
||||
### TextGenerationPipeline
|
||||
|
||||
[[autodoc]] TextGenerationPipeline
|
||||
- __call__
|
||||
- all
|
||||
|
||||
### Text2TextGenerationPipeline
|
||||
|
||||
[[autodoc]] Text2TextGenerationPipeline
|
||||
- __call__
|
||||
- all
|
||||
|
||||
### TokenClassificationPipeline
|
||||
|
||||
[[autodoc]] TokenClassificationPipeline
|
||||
- __call__
|
||||
- all
|
||||
|
||||
### TranslationPipeline
|
||||
|
||||
[[autodoc]] TranslationPipeline
|
||||
- __call__
|
||||
- all
|
||||
|
||||
### ZeroShotClassificationPipeline
|
||||
|
||||
[[autodoc]] ZeroShotClassificationPipeline
|
||||
- __call__
|
||||
- all
|
||||
|
||||
## Parent class: `Pipeline`
|
||||
|
||||
[[autodoc]] Pipeline
|
||||
@@ -1,407 +0,0 @@
|
||||
..
|
||||
Copyright 2020 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
|
||||
Pipelines
|
||||
-----------------------------------------------------------------------------------------------------------------------
|
||||
|
||||
The pipelines are a great and easy way to use models for inference. These pipelines are objects that abstract most of
|
||||
the complex code from the library, offering a simple API dedicated to several tasks, including Named Entity
|
||||
Recognition, Masked Language Modeling, Sentiment Analysis, Feature Extraction and Question Answering. See the
|
||||
:doc:`task summary <../task_summary>` for examples of use.
|
||||
|
||||
There are two categories of pipeline abstractions to be aware of:
|
||||
|
||||
- The :func:`~transformers.pipeline` which is the most powerful object encapsulating all other pipelines.
|
||||
- The other task-specific pipelines:
|
||||
|
||||
- :class:`~transformers.AudioClassificationPipeline`
|
||||
- :class:`~transformers.AutomaticSpeechRecognitionPipeline`
|
||||
- :class:`~transformers.ConversationalPipeline`
|
||||
- :class:`~transformers.FeatureExtractionPipeline`
|
||||
- :class:`~transformers.FillMaskPipeline`
|
||||
- :class:`~transformers.ImageClassificationPipeline`
|
||||
- :class:`~transformers.ImageSegmentationPipeline`
|
||||
- :class:`~transformers.ObjectDetectionPipeline`
|
||||
- :class:`~transformers.QuestionAnsweringPipeline`
|
||||
- :class:`~transformers.SummarizationPipeline`
|
||||
- :class:`~transformers.TableQuestionAnsweringPipeline`
|
||||
- :class:`~transformers.TextClassificationPipeline`
|
||||
- :class:`~transformers.TextGenerationPipeline`
|
||||
- :class:`~transformers.Text2TextGenerationPipeline`
|
||||
- :class:`~transformers.TokenClassificationPipeline`
|
||||
- :class:`~transformers.TranslationPipeline`
|
||||
- :class:`~transformers.ZeroShotClassificationPipeline`
|
||||
|
||||
The pipeline abstraction
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The `pipeline` abstraction is a wrapper around all the other available pipelines. It is instantiated like any other
|
||||
pipeline but can provide additional quality of life.
|
||||
|
||||
Simple call on one item:
|
||||
|
||||
.. code-block::
|
||||
|
||||
>>> pipe = pipeline("text-classification")
|
||||
>>> pipe("This restaurant is awesome")
|
||||
[{'label': 'POSITIVE', 'score': 0.9998743534088135}]
|
||||
|
||||
If you want to use a specific model from the `hub <https://huggingface.co>`__ you can omit the task if the model on
|
||||
the hub already defines it:
|
||||
|
||||
.. code-block::
|
||||
|
||||
>>> pipe = pipeline(model="roberta-large-mnli")
|
||||
>>> pipe("This restaurant is awesome")
|
||||
[{'label': 'POSITIVE', 'score': 0.9998743534088135}]
|
||||
|
||||
To call a pipeline on many items, you can call it with a `list`:
|
||||
|
||||
.. code-block::
|
||||
|
||||
>>> pipe = pipeline("text-classification")
|
||||
>>> pipe(["This restaurant is awesome", "This restaurant is aweful"])
|
||||
[{'label': 'POSITIVE', 'score': 0.9998743534088135},
|
||||
{'label': 'NEGATIVE', 'score': 0.9996669292449951}]
|
||||
|
||||
|
||||
To iterate over full datasets, it is recommended to use a :obj:`dataset` directly. This means you don't need to allocate
|
||||
the whole dataset at once, nor do you need to do batching yourself. This should work just as fast as custom loops on
|
||||
GPU. If it doesn't, don't hesitate to create an issue.
|
||||
|
||||
.. code-block::
|
||||
|
||||
import datasets
|
||||
from transformers import pipeline
|
||||
from transformers.pipelines.base import KeyDataset
|
||||
import tqdm
|
||||
|
||||
pipe = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h", device=0)
|
||||
dataset = datasets.load_dataset("superb", name="asr", split="test")
|
||||
|
||||
# KeyDataset (only `pt`) will simply return the item in the dict returned by the dataset item
|
||||
# as we're not interested in the `target` part of the dataset.
|
||||
for out in tqdm.tqdm(pipe(KeyDataset(dataset, "file"))):
    print(out)
|
||||
# {"text": "NUMBER TEN FRESH NELLY IS WAITING ON YOU GOOD NIGHT HUSBAND"}
|
||||
# {"text": ....}
|
||||
# ....
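
To make the role of the key wrapper concrete, here is a minimal sketch of the idea in plain Python. `KeyDatasetSketch` and `stream` are hypothetical stand-ins written for this illustration, not the library's classes; the sketch only assumes that each dataset item is a dict and that a single field should be handed to the pipeline.

```python
class KeyDatasetSketch:
    """Hypothetical stand-in: exposes one field of each dict-shaped item."""

    def __init__(self, dataset, key):
        self.dataset = dataset
        self.key = key

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, i):
        # Return only the requested field, dropping everything else (e.g. the target).
        return self.dataset[i][self.key]


def stream(dataset):
    """Yield items one at a time, so nothing has to be materialized up front."""
    for i in range(len(dataset)):
        yield dataset[i]


rows = [
    {"file": "a.wav", "target": "HELLO"},
    {"file": "b.wav", "target": "WORLD"},
]
files = list(stream(KeyDatasetSketch(rows, "file")))
print(files)
```

Because items are produced lazily, the consumer decides how many to hold in memory at once, which is what lets the pipeline do the batching for you.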
|
||||
|
||||
|
||||
.. autofunction:: transformers.pipeline
|
||||
|
||||
Pipeline batching
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
All pipelines (except `zero-shot-classification` and `question-answering` currently) can use batching. This will work
|
||||
whenever the pipeline uses its streaming ability (so when passing lists or :obj:`Dataset`).
|
||||
|
||||
.. code-block::
|
||||
|
||||
from transformers import pipeline
|
||||
from transformers.pipelines.base import KeyDataset
|
||||
import datasets
|
||||
import tqdm
|
||||
|
||||
dataset = datasets.load_dataset("imdb", name="plain_text", split="unsupervised")
|
||||
pipe = pipeline("text-classification", device=0)
|
||||
for out in pipe(KeyDataset(dataset, "text"), batch_size=8, truncation="only_first"):
    print(out)
    # [{'label': 'POSITIVE', 'score': 0.9998743534088135}]
    # Exactly the same output as before, but the contents are passed
    # as batches to the model
|
||||
|
||||
|
||||
.. warning::
|
||||
|
||||
However, this is not automatically a win for performance. It can be either a 10x speedup or 5x slowdown depending
|
||||
on hardware, data and the actual model being used.
|
||||
|
||||
Example where it's mostly a speedup:
|
||||
|
||||
|
||||
.. code-block::
|
||||
|
||||
from transformers import pipeline
|
||||
from torch.utils.data import Dataset
|
||||
import tqdm
|
||||
|
||||
|
||||
pipe = pipeline("text-classification", device=0)
|
||||
|
||||
|
||||
class MyDataset(Dataset):
    def __len__(self):
        return 5000

    def __getitem__(self, i):
        return "This is a test"
|
||||
|
||||
|
||||
dataset = MyDataset()
|
||||
|
||||
for batch_size in [1, 8, 64, 256]:
    print("-" * 30)
    print(f"Streaming batch_size={batch_size}")
    for out in tqdm.tqdm(pipe(dataset, batch_size=batch_size), total=len(dataset)):
        pass
|
||||
|
||||
|
||||
.. code-block::
|
||||
|
||||
# On GTX 970
|
||||
------------------------------
|
||||
Streaming no batching
|
||||
100%|██████████████████████████████████████████████████████████████████████| 5000/5000 [00:26<00:00, 187.52it/s]
|
||||
------------------------------
|
||||
Streaming batch_size=8
|
||||
100%|█████████████████████████████████████████████████████████████████████| 5000/5000 [00:04<00:00, 1205.95it/s]
|
||||
------------------------------
|
||||
Streaming batch_size=64
|
||||
100%|█████████████████████████████████████████████████████████████████████| 5000/5000 [00:02<00:00, 2478.24it/s]
|
||||
------------------------------
|
||||
Streaming batch_size=256
|
||||
100%|█████████████████████████████████████████████████████████████████████| 5000/5000 [00:01<00:00, 2554.43it/s]
|
||||
(diminishing returns, saturated the GPU)
|
||||
|
||||
|
||||
Example where it's mostly a slowdown:
|
||||
|
||||
.. code-block::
|
||||
|
||||
class MyDataset(Dataset):
    def __len__(self):
        return 5000

    def __getitem__(self, i):
        if i % 64 == 0:
            n = 100
        else:
            n = 1
        return "This is a test" * n
|
||||
|
||||
This is an occasional very long sentence compared to the others. In that case, the **whole** batch will need to be 400
|
||||
tokens long, so the whole batch will be [64, 400] instead of [64, 4], leading to a large slowdown. Even worse, on
|
||||
bigger batches, the program simply crashes.
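
The padding cost described above can be estimated with a few lines of arithmetic. The sketch below is only an illustration: it assumes 4 tokens for the short sentence and 400 for the long one (the figures quoted in the paragraph), counts padded token slots rather than measuring runtime, and `padded_tokens` is a hypothetical helper, not a library function.

```python
def padded_tokens(lengths, batch_size):
    """Total token slots processed when every batch is padded to its longest item."""
    total = 0
    for start in range(0, len(lengths), batch_size):
        batch = lengths[start:start + batch_size]
        total += len(batch) * max(batch)
    return total


# Same pattern as the dataset above: every 64th item is 100x longer.
lengths = [400 if i % 64 == 0 else 4 for i in range(5000)]
useful = sum(lengths)

for batch_size in (1, 8, 64):
    cost = padded_tokens(lengths, batch_size)
    print(f"batch_size={batch_size}: {cost} token slots, {cost / useful:.1f}x the useful tokens")
```

With contiguous batches of 64, every batch contains exactly one long item, so every short sentence is padded up to the long length. This is why the wasted work grows with the batch size on this kind of data.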
|
||||
|
||||
|
||||
.. code-block::
|
||||
|
||||
------------------------------
|
||||
Streaming no batching
|
||||
100%|█████████████████████████████████████████████████████████████████████| 1000/1000 [00:05<00:00, 183.69it/s]
|
||||
------------------------------
|
||||
Streaming batch_size=8
|
||||
100%|█████████████████████████████████████████████████████████████████████| 1000/1000 [00:03<00:00, 265.74it/s]
|
||||
------------------------------
|
||||
Streaming batch_size=64
|
||||
100%|██████████████████████████████████████████████████████████████████████| 1000/1000 [00:26<00:00, 37.80it/s]
|
||||
------------------------------
|
||||
Streaming batch_size=256
|
||||
0%| | 0/1000 [00:00<?, ?it/s]
|
||||
Traceback (most recent call last):
|
||||
File "/home/nicolas/src/transformers/test.py", line 42, in <module>
|
||||
for out in tqdm.tqdm(pipe(dataset, batch_size=256), total=len(dataset)):
|
||||
....
|
||||
q = q / math.sqrt(dim_per_head) # (bs, n_heads, q_length, dim_per_head)
|
||||
RuntimeError: CUDA out of memory. Tried to allocate 376.00 MiB (GPU 0; 3.95 GiB total capacity; 1.72 GiB already allocated; 354.88 MiB free; 2.46 GiB reserved in total by PyTorch)
|
||||
|
||||
|
||||
There are no good (general) solutions for this problem, and your mileage may vary depending on your use cases.
|
||||
|
||||
For users, a rule of thumb is:
|
||||
|
||||
- **Measure performance on your load, with your hardware. Measure, measure, and keep measuring. Real numbers are the
|
||||
only way to go.**
|
||||
- If you are latency constrained (live product doing inference), don't batch.
|
||||
- If you are using CPU, don't batch.
|
||||
- If you are optimizing for throughput (you want to run your model on a bunch of static data), on GPU, then:
|
||||
|
||||
- If you have no clue about the size of the sequence_length ("natural" data), by default don't batch, measure and
|
||||
try tentatively to add it, add OOM checks to recover when it fails (and it will at some point if you don't
|
||||
control the sequence_length.)
|
||||
- If your sequence_length is super regular, then batching is more likely to be VERY interesting, measure and push
|
||||
it until you get OOMs.
|
||||
- The larger the GPU, the more likely batching is going to be interesting.
|
||||
- As soon as you enable batching, make sure you can handle OOMs nicely.
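
One way to "handle OOMs nicely" is to retry with a smaller batch when a batch fails. The following is a minimal sketch under stated assumptions: `run_with_oom_fallback` and `fake_run_batch` are hypothetical names, the out-of-memory condition is assumed to surface as a `RuntimeError` whose message contains "out of memory" (as in the traceback shown earlier), and the failure is simulated so the example runs without a GPU.

```python
def run_with_oom_fallback(items, run_batch, batch_size):
    """Process items in batches, halving the batch size whenever a batch runs out of memory."""
    results = []
    start = 0
    while start < len(items):
        chunk = items[start:start + batch_size]
        try:
            outputs = run_batch(chunk)
        except RuntimeError as err:
            # Re-raise anything that is not an OOM, or an OOM we cannot shrink away.
            if "out of memory" not in str(err) or batch_size == 1:
                raise
            batch_size = max(1, batch_size // 2)
            continue  # retry the same position with a smaller batch
        results.extend(outputs)
        start += len(chunk)
    return results, batch_size


def fake_run_batch(chunk):
    # Simulated model call: pretend anything above 4 items does not fit in memory.
    if len(chunk) > 4:
        raise RuntimeError("CUDA out of memory (simulated)")
    return [text.upper() for text in chunk]


texts = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]
outputs, final_batch_size = run_with_oom_fallback(texts, fake_run_batch, batch_size=16)
print(outputs, final_batch_size)
```

The batch size only ever shrinks in this sketch, which keeps it simple but conservative. Whether that is the right trade-off depends on your data, so measure before relying on it.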
|
||||
|
||||
Pipeline custom code
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
You may want to override a specific pipeline.
|
||||
|
||||
Don't hesitate to create an issue for your task at hand, the goal of the pipeline is to be easy to use and support most
|
||||
cases, so :obj:`transformers` could maybe support your use case.
|
||||
|
||||
|
||||
If you simply want to try it out, you can:
|
||||
|
||||
- Subclass your pipeline of choice
|
||||
|
||||
.. code-block::
|
||||
|
||||
class MyPipeline(TextClassificationPipeline):
    def postprocess(...):
        ...
        scores = scores * 100
        ...

my_pipeline = MyPipeline(model=model, tokenizer=tokenizer, ...)
# or if you use `pipeline` function, then:
my_pipeline = pipeline(model="xxxx", pipeline_class=MyPipeline)
|
||||
|
||||
That should enable you to do all the custom code you want.
|
||||
|
||||
|
||||
Implementing a pipeline
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
:doc:`Implementing a new pipeline <../add_new_pipeline>`
|
||||
|
||||
The task specific pipelines
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
|
||||
AudioClassificationPipeline
|
||||
=======================================================================================================================
|
||||
|
||||
.. autoclass:: transformers.AudioClassificationPipeline
|
||||
:special-members: __call__
|
||||
:members:
|
||||
|
||||
AutomaticSpeechRecognitionPipeline
|
||||
=======================================================================================================================
|
||||
|
||||
.. autoclass:: transformers.AutomaticSpeechRecognitionPipeline
|
||||
:special-members: __call__
|
||||
:members:
|
||||
|
||||
ConversationalPipeline
|
||||
=======================================================================================================================
|
||||
|
||||
.. autoclass:: transformers.Conversation
|
||||
|
||||
.. autoclass:: transformers.ConversationalPipeline
|
||||
:special-members: __call__
|
||||
:members:
|
||||
|
||||
FeatureExtractionPipeline
|
||||
=======================================================================================================================
|
||||
|
||||
.. autoclass:: transformers.FeatureExtractionPipeline
|
||||
:special-members: __call__
|
||||
:members:
|
||||
|
||||
FillMaskPipeline
|
||||
=======================================================================================================================
|
||||
|
||||
.. autoclass:: transformers.FillMaskPipeline
|
||||
:special-members: __call__
|
||||
:members:
|
||||
|
||||
ImageClassificationPipeline
|
||||
=======================================================================================================================
|
||||
|
||||
.. autoclass:: transformers.ImageClassificationPipeline
|
||||
:special-members: __call__
|
||||
:members:
|
||||
|
||||
ImageSegmentationPipeline
|
||||
=======================================================================================================================
|
||||
|
||||
.. autoclass:: transformers.ImageSegmentationPipeline
|
||||
:special-members: __call__
|
||||
:members:
|
||||
|
||||
NerPipeline
|
||||
=======================================================================================================================
|
||||
|
||||
.. autoclass:: transformers.NerPipeline
|
||||
|
||||
See :class:`~transformers.TokenClassificationPipeline` for all details.
|
||||
|
||||
ObjectDetectionPipeline
|
||||
=======================================================================================================================
|
||||
|
||||
.. autoclass:: transformers.ObjectDetectionPipeline
|
||||
:special-members: __call__
|
||||
:members:
|
||||
|
||||
QuestionAnsweringPipeline
|
||||
=======================================================================================================================
|
||||
|
||||
.. autoclass:: transformers.QuestionAnsweringPipeline
|
||||
:special-members: __call__
|
||||
:members:
|
||||
|
||||
SummarizationPipeline
|
||||
=======================================================================================================================
|
||||
|
||||
.. autoclass:: transformers.SummarizationPipeline
|
||||
:special-members: __call__
|
||||
:members:
|
||||
|
||||
TableQuestionAnsweringPipeline
|
||||
=======================================================================================================================
|
||||
|
||||
.. autoclass:: transformers.TableQuestionAnsweringPipeline
|
||||
:special-members: __call__
|
||||
|
||||
|
||||
TextClassificationPipeline
|
||||
=======================================================================================================================
|
||||
|
||||
.. autoclass:: transformers.TextClassificationPipeline
|
||||
:special-members: __call__
|
||||
:members:
|
||||
|
||||
TextGenerationPipeline
|
||||
=======================================================================================================================
|
||||
|
||||
.. autoclass:: transformers.TextGenerationPipeline
|
||||
:special-members: __call__
|
||||
:members:
|
||||
|
||||
Text2TextGenerationPipeline
|
||||
=======================================================================================================================
|
||||
|
||||
.. autoclass:: transformers.Text2TextGenerationPipeline
|
||||
:special-members: __call__
|
||||
:members:
|
||||
|
||||
TokenClassificationPipeline
|
||||
=======================================================================================================================
|
||||
|
||||
.. autoclass:: transformers.TokenClassificationPipeline
|
||||
:special-members: __call__
|
||||
:members:
|
||||
|
||||
TranslationPipeline
|
||||
=======================================================================================================================
|
||||
|
||||
.. autoclass:: transformers.TranslationPipeline
|
||||
:special-members: __call__
|
||||
:members:
|
||||
|
||||
ZeroShotClassificationPipeline
|
||||
=======================================================================================================================
|
||||
|
||||
.. autoclass:: transformers.ZeroShotClassificationPipeline
|
||||
:special-members: __call__
|
||||
:members:
|
||||
|
||||
Parent class: :obj:`Pipeline`
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.Pipeline
|
||||
:members:
|
||||
152
docs/source/main_classes/processors.mdx
Normal file
@@ -0,0 +1,152 @@
|
||||
<!--Copyright 2020 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
-->
|
||||
|
||||
# Processors
|
||||
|
||||
This library includes processors for several traditional tasks. These processors can be used to process a dataset into
|
||||
examples that can be fed to a model.
|
||||
|
||||
## Processors
|
||||
|
||||
All processors follow the same architecture which is that of the
|
||||
[`~data.processors.utils.DataProcessor`]. The processor returns a list of
|
||||
[`~data.processors.utils.InputExample`]. These
|
||||
[`~data.processors.utils.InputExample`] can be converted to
|
||||
[`~data.processors.utils.InputFeatures`] in order to be fed to the model.
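
The example-to-features flow described above can be sketched without the library. Everything below is a hypothetical stand-in written for illustration: `ExampleSketch` and `FeaturesSketch` only mirror the idea of raw examples and model-ready features, and the whitespace "tokenizer" and tiny vocabulary are assumptions, not what the real processors use.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExampleSketch:
    """Raw text plus label, as read from a data file."""
    guid: str
    text_a: str
    text_b: Optional[str] = None
    label: Optional[str] = None


@dataclass
class FeaturesSketch:
    """Numeric inputs, padded to a fixed length, ready for a model."""
    input_ids: List[int]
    attention_mask: List[int]
    label: Optional[int] = None


def convert_examples_to_features(examples, vocab, label_list, max_length):
    label_map = {label: i for i, label in enumerate(label_list)}
    features = []
    for example in examples:
        words = example.text_a.lower().split()
        if example.text_b is not None:
            words += example.text_b.lower().split()
        # Map words to ids, truncate, then pad up to max_length.
        ids = [vocab.get(word, vocab["<unk>"]) for word in words][:max_length]
        mask = [1] * len(ids)
        padding = max_length - len(ids)
        ids += [vocab["<pad>"]] * padding
        mask += [0] * padding
        features.append(FeaturesSketch(ids, mask, label_map.get(example.label)))
    return features


vocab = {"<pad>": 0, "<unk>": 1, "the": 2, "food": 3, "was": 4, "great": 5}
examples = [ExampleSketch(guid="dev-0", text_a="The food was great", label="positive")]
features = convert_examples_to_features(examples, vocab, ["negative", "positive"], max_length=6)
print(features[0])
```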
|
||||
|
||||
[[autodoc]] data.processors.utils.DataProcessor
|
||||
|
||||
[[autodoc]] data.processors.utils.InputExample
|
||||
|
||||
[[autodoc]] data.processors.utils.InputFeatures
|
||||
|
||||
## GLUE
|
||||
|
||||
[General Language Understanding Evaluation (GLUE)](https://gluebenchmark.com/) is a benchmark that evaluates the
|
||||
performance of models across a diverse set of existing NLU tasks. It was released together with the paper [GLUE: A
|
||||
multi-task benchmark and analysis platform for natural language understanding](https://openreview.net/pdf?id=rJ4km2R5t7).
|
||||
|
||||
This library hosts a total of 10 processors for the following tasks: MRPC, MNLI, MNLI (mismatched), CoLA, SST2, STSB,
|
||||
QQP, QNLI, RTE and WNLI.
|
||||
|
||||
Those processors are:
|
||||
|
||||
- [`~data.processors.utils.MrpcProcessor`]
|
||||
- [`~data.processors.utils.MnliProcessor`]
|
||||
- [`~data.processors.utils.MnliMismatchedProcessor`]
|
||||
- [`~data.processors.utils.Sst2Processor`]
|
||||
- [`~data.processors.utils.StsbProcessor`]
|
||||
- [`~data.processors.utils.QqpProcessor`]
|
||||
- [`~data.processors.utils.QnliProcessor`]
|
||||
- [`~data.processors.utils.RteProcessor`]
|
||||
- [`~data.processors.utils.WnliProcessor`]
|
||||
|
||||
Additionally, the following method can be used to load values from a data file and convert them to a list of
|
||||
[`~data.processors.utils.InputExample`].
|
||||
|
||||
[[autodoc]] data.processors.glue.glue_convert_examples_to_features
|
||||
|
||||
|
||||
### Example usage
|
||||
|
||||
An example using these processors is given in the [run_glue.py](https://github.com/huggingface/transformers/tree/master/examples/legacy/text-classification/run_glue.py) script.
|
||||
|
||||
|
||||
## XNLI
|
||||
|
||||
[The Cross-Lingual NLI Corpus (XNLI)](https://www.nyu.edu/projects/bowman/xnli/) is a benchmark that evaluates the
|
||||
quality of cross-lingual text representations. XNLI is a crowd-sourced dataset based on [*MultiNLI*](http://www.nyu.edu/projects/bowman/multinli/): pairs of text are labeled with textual entailment annotations for 15
|
||||
different languages (including both high-resource languages such as English and low-resource languages such as Swahili).
|
||||
|
||||
It was released together with the paper [XNLI: Evaluating Cross-lingual Sentence Representations](https://arxiv.org/abs/1809.05053).
|
||||
|
||||
This library hosts the processor to load the XNLI data:
|
||||
|
||||
- [`~data.processors.utils.XnliProcessor`]
|
||||
|
||||
Please note that since the gold labels are available on the test set, evaluation is performed on the test set.
|
||||
|
||||
An example using these processors is given in the [run_xnli.py](https://github.com/huggingface/transformers/tree/master/examples/legacy/text-classification/run_xnli.py) script.
|
||||
|
||||
|
||||
## SQuAD
|
||||
|
||||
[The Stanford Question Answering Dataset (SQuAD)](https://rajpurkar.github.io/SQuAD-explorer//) is a benchmark that
|
||||
evaluates the performance of models on question answering. Two versions are available, v1.1 and v2.0. The first version
|
||||
(v1.1) was released together with the paper [SQuAD: 100,000+ Questions for Machine Comprehension of Text](https://arxiv.org/abs/1606.05250). The second version (v2.0) was released alongside the paper [Know What You Don't
|
||||
Know: Unanswerable Questions for SQuAD](https://arxiv.org/abs/1806.03822).
|
||||
|
||||
This library hosts a processor for each of the two versions:
|
||||
|
||||
### Processors
|
||||
|
||||
Those processors are:
|
||||
|
||||
- [`~data.processors.utils.SquadV1Processor`]
|
||||
- [`~data.processors.utils.SquadV2Processor`]
|
||||
|
||||
They both inherit from the abstract class [`~data.processors.utils.SquadProcessor`].
|
||||
|
||||
[[autodoc]] data.processors.squad.SquadProcessor
|
||||
- all
|
||||
|
||||
Additionally, the following method can be used to convert SQuAD examples into
|
||||
[`~data.processors.utils.SquadFeatures`] that can be used as model inputs.
|
||||
|
||||
[[autodoc]] data.processors.squad.squad_convert_examples_to_features
|
||||
|
||||
|
||||
These processors as well as the aforementioned method can be used with files containing the data as well as with the
|
||||
*tensorflow_datasets* package. Examples are given below.
|
||||
|
||||
|
||||
### Example usage
|
||||
|
||||
Here is an example using the processors as well as the conversion method using data files:
|
||||
|
||||
```python
|
||||
# Loading a V2 processor
|
||||
processor = SquadV2Processor()
|
||||
examples = processor.get_dev_examples(squad_v2_data_dir)
|
||||
|
||||
# Loading a V1 processor
|
||||
processor = SquadV1Processor()
|
||||
examples = processor.get_dev_examples(squad_v1_data_dir)
|
||||
|
||||
features = squad_convert_examples_to_features(
|
||||
examples=examples,
|
||||
tokenizer=tokenizer,
|
||||
max_seq_length=max_seq_length,
|
||||
doc_stride=args.doc_stride,
|
||||
max_query_length=max_query_length,
|
||||
is_training=not evaluate,
|
||||
)
|
||||
```
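
The `doc_stride` argument controls how a long context is split into overlapping windows. The sketch below illustrates only that idea and is deliberately simplified: `sliding_windows` is a hypothetical helper, and it ignores the query tokens and special tokens that the real conversion also has to fit into `max_seq_length`.

```python
def sliding_windows(tokens, max_len, doc_stride):
    """Split tokens into windows of at most max_len, starting every doc_stride tokens."""
    if max_len <= 0 or doc_stride <= 0:
        raise ValueError("max_len and doc_stride must be positive")
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # this window already reaches the end of the document
        start += doc_stride
    return windows


context = list(range(10))  # stand-in for 10 context tokens
windows = sliding_windows(context, max_len=4, doc_stride=2)
print(windows)
```

Keeping the stride smaller than the window length makes consecutive windows overlap, so an answer that straddles a window boundary is still fully contained in at least one window.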
|
||||
|
||||
Using *tensorflow_datasets* is as easy as using a data file:
|
||||
|
||||
```python
|
||||
# tensorflow_datasets only handles SQuAD V1.
|
||||
tfds_examples = tfds.load("squad")
|
||||
examples = SquadV1Processor().get_examples_from_dataset(tfds_examples, evaluate=evaluate)
|
||||
|
||||
features = squad_convert_examples_to_features(
|
||||
examples=examples,
|
||||
tokenizer=tokenizer,
|
||||
max_seq_length=max_seq_length,
|
||||
doc_stride=args.doc_stride,
|
||||
max_query_length=max_query_length,
|
||||
is_training=not evaluate,
|
||||
)
|
||||
```
|
||||
|
||||
Another example using these processors is given in the [run_squad.py](https://github.com/huggingface/transformers/tree/master/examples/legacy/question-answering/run_squad.py) script.
|
||||
@@ -1,172 +0,0 @@
|
||||
..
|
||||
Copyright 2020 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
|
||||
Processors
|
||||
-----------------------------------------------------------------------------------------------------------------------
|
||||
|
||||
This library includes processors for several traditional tasks. These processors can be used to process a dataset into
|
||||
examples that can be fed to a model.
|
||||
|
||||
Processors
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
All processors follow the same architecture which is that of the
|
||||
:class:`~transformers.data.processors.utils.DataProcessor`. The processor returns a list of
|
||||
:class:`~transformers.data.processors.utils.InputExample`. These
|
||||
:class:`~transformers.data.processors.utils.InputExample` can be converted to
|
||||
:class:`~transformers.data.processors.utils.InputFeatures` in order to be fed to the model.
|
||||
|
||||
.. autoclass:: transformers.data.processors.utils.DataProcessor
|
||||
:members:
|
||||
|
||||
|
||||
.. autoclass:: transformers.data.processors.utils.InputExample
|
||||
:members:
|
||||
|
||||
|
||||
.. autoclass:: transformers.data.processors.utils.InputFeatures
|
||||
:members:
|
||||
|
||||
|
||||
GLUE
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
`General Language Understanding Evaluation (GLUE) <https://gluebenchmark.com/>`__ is a benchmark that evaluates the
|
||||
performance of models across a diverse set of existing NLU tasks. It was released together with the paper `GLUE: A
|
||||
multi-task benchmark and analysis platform for natural language understanding
|
||||
<https://openreview.net/pdf?id=rJ4km2R5t7>`__
|
||||
|
||||
This library hosts a total of 10 processors for the following tasks: MRPC, MNLI, MNLI (mismatched), CoLA, SST2, STSB,
|
||||
QQP, QNLI, RTE and WNLI.
|
||||
|
||||
Those processors are:
|
||||
|
||||
- :class:`~transformers.data.processors.utils.MrpcProcessor`
|
||||
- :class:`~transformers.data.processors.utils.MnliProcessor`
|
||||
- :class:`~transformers.data.processors.utils.MnliMismatchedProcessor`
|
||||
- :class:`~transformers.data.processors.utils.Sst2Processor`
|
||||
- :class:`~transformers.data.processors.utils.StsbProcessor`
|
||||
- :class:`~transformers.data.processors.utils.QqpProcessor`
|
||||
- :class:`~transformers.data.processors.utils.QnliProcessor`
|
||||
- :class:`~transformers.data.processors.utils.RteProcessor`
|
||||
- :class:`~transformers.data.processors.utils.WnliProcessor`
|
||||
|
||||
Additionally, the following method can be used to load values from a data file and convert them to a list of
|
||||
:class:`~transformers.data.processors.utils.InputExample`.
|
||||
|
||||
.. automethod:: transformers.data.processors.glue.glue_convert_examples_to_features
|
||||
|
||||
Example usage
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
An example using these processors is given in the :prefix_link:`run_glue.py
|
||||
<examples/legacy/text-classification/run_glue.py>` script.
|
||||
|
||||
|
||||
XNLI
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
`The Cross-Lingual NLI Corpus (XNLI) <https://www.nyu.edu/projects/bowman/xnli/>`__ is a benchmark that evaluates the
|
||||
quality of cross-lingual text representations. XNLI is a crowd-sourced dataset based on `MultiNLI
|
||||
<http://www.nyu.edu/projects/bowman/multinli/>`__: pairs of text are labeled with textual entailment annotations for 15
|
||||
different languages (including both high-resource languages such as English and low-resource languages such as Swahili).
|
||||
|
||||
It was released together with the paper `XNLI: Evaluating Cross-lingual Sentence Representations
|
||||
<https://arxiv.org/abs/1809.05053>`__
|
||||
|
||||
This library hosts the processor to load the XNLI data:
|
||||
|
||||
- :class:`~transformers.data.processors.utils.XnliProcessor`
|
||||
|
||||
Please note that since the gold labels are available on the test set, evaluation is performed on the test set.
|
||||
|
||||
An example using these processors is given in the :prefix_link:`run_xnli.py
|
||||
<examples/legacy/text-classification/run_xnli.py>` script.
|
||||
|
||||
|
||||
SQuAD
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
`The Stanford Question Answering Dataset (SQuAD) <https://rajpurkar.github.io/SQuAD-explorer//>`__ is a benchmark that
|
||||
evaluates the performance of models on question answering. Two versions are available, v1.1 and v2.0. The first version
|
||||
(v1.1) was released together with the paper `SQuAD: 100,000+ Questions for Machine Comprehension of Text
|
||||
<https://arxiv.org/abs/1606.05250>`__. The second version (v2.0) was released alongside the paper `Know What You Don't
|
||||
Know: Unanswerable Questions for SQuAD <https://arxiv.org/abs/1806.03822>`__.
|
||||
|
||||
This library hosts a processor for each of the two versions:
|
||||
|
||||
Processors
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Those processors are:
|
||||
|
||||
- :class:`~transformers.data.processors.utils.SquadV1Processor`
|
||||
- :class:`~transformers.data.processors.utils.SquadV2Processor`
|
||||
|
||||
They both inherit from the abstract class :class:`~transformers.data.processors.utils.SquadProcessor`
|
||||
|
||||
.. autoclass:: transformers.data.processors.squad.SquadProcessor
|
||||
:members:
|
||||
|
||||
Additionally, the following method can be used to convert SQuAD examples into
|
||||
:class:`~transformers.data.processors.utils.SquadFeatures` that can be used as model inputs.
|
||||
|
||||
.. automethod:: transformers.data.processors.squad.squad_convert_examples_to_features
|
||||
|
||||
These processors as well as the aforementioned method can be used with files containing the data as well as with the
|
||||
`tensorflow_datasets` package. Examples are given below.
|
||||
|
||||
|
||||
Example usage
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Here is an example using the processors as well as the conversion method using data files:
|
||||
|
||||
.. code-block::
|
||||
|
||||
# Loading a V2 processor
|
||||
processor = SquadV2Processor()
|
||||
examples = processor.get_dev_examples(squad_v2_data_dir)
|
||||
|
||||
# Loading a V1 processor
|
||||
processor = SquadV1Processor()
|
||||
examples = processor.get_dev_examples(squad_v1_data_dir)
|
||||
|
||||
features = squad_convert_examples_to_features(
|
||||
examples=examples,
|
||||
tokenizer=tokenizer,
|
||||
max_seq_length=max_seq_length,
|
||||
doc_stride=args.doc_stride,
|
||||
max_query_length=max_query_length,
|
||||
is_training=not evaluate,
|
||||
)
|
||||
|
||||
Using `tensorflow_datasets` is as easy as using a data file:
|
||||
|
||||
.. code-block::
|
||||
|
||||
# tensorflow_datasets only handles SQuAD V1.
|
||||
tfds_examples = tfds.load("squad")
|
||||
examples = SquadV1Processor().get_examples_from_dataset(tfds_examples, evaluate=evaluate)
|
||||
|
||||
features = squad_convert_examples_to_features(
|
||||
examples=examples,
|
||||
tokenizer=tokenizer,
|
||||
max_seq_length=max_seq_length,
|
||||
doc_stride=args.doc_stride,
|
||||
max_query_length=max_query_length,
|
||||
is_training=not evaluate,
|
||||
)
|
||||
|
||||
|
||||
Another example using these processors is given in the :prefix_link:`run_squad.py
|
||||
<examples/legacy/question-answering/run_squad.py>` script.
|
||||
77
docs/source/main_classes/tokenizer.mdx
Normal file
@@ -0,0 +1,77 @@
|
||||
<!--Copyright 2020 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
-->
|
||||
|
||||
# Tokenizer
|
||||
|
||||
A tokenizer is in charge of preparing the inputs for a model. The library contains tokenizers for all the models. Most
|
||||
of the tokenizers are available in two flavors: a full python implementation and a "Fast" implementation based on the
|
||||
Rust library [🤗 Tokenizers](https://github.com/huggingface/tokenizers). The "Fast" implementations allow:
|
||||
|
||||
1. a significant speed-up in particular when doing batched tokenization and
|
||||
2. additional methods to map between the original string (character and words) and the token space (e.g. getting the
|
||||
index of the token comprising a given character or the span of characters corresponding to a given token). Currently
|
||||
no "Fast" implementation is available for the SentencePiece-based tokenizers (for T5, ALBERT, CamemBERT, XLM-RoBERTa
|
||||
and XLNet models).
|
||||
|
||||
The base classes [`PreTrainedTokenizer`] and [`PreTrainedTokenizerFast`]
|
||||
implement the common methods for encoding string inputs into model inputs (see below) and instantiating/saving python and
|
||||
"Fast" tokenizers either from a local file or directory or from a pretrained tokenizer provided by the library
|
||||
(downloaded from HuggingFace's AWS S3 repository). They both rely on
|
||||
[`~tokenization_utils_base.PreTrainedTokenizerBase`] that contains the common methods, and
|
||||
[`~tokenization_utils_base.SpecialTokensMixin`].
|
||||
|
||||
[`PreTrainedTokenizer`] and [`PreTrainedTokenizerFast`] thus implement the main
|
||||
methods for using all the tokenizers:
|
||||
|
||||
- Tokenizing (splitting strings into sub-word token strings), converting token strings to ids and back, and
|
||||
encoding/decoding (i.e., tokenizing and converting to integers).
|
||||
- Adding new tokens to the vocabulary in a way that is independent of the underlying structure (BPE, SentencePiece...).
|
||||
- Managing special tokens (like mask, beginning-of-sentence, etc.): adding them, assigning them to attributes in the
|
||||
tokenizer for easy access and making sure they are not split during tokenization.

[`BatchEncoding`] holds the output of the
[`~tokenization_utils_base.PreTrainedTokenizerBase`]'s encoding methods (`__call__`,
`encode_plus` and `batch_encode_plus`) and is derived from a Python dictionary. When the tokenizer is a pure Python
tokenizer, this class behaves just like a standard Python dictionary and holds the various model inputs computed by
these methods (`input_ids`, `attention_mask`...). When the tokenizer is a "Fast" tokenizer (i.e., backed by the
HuggingFace [tokenizers library](https://github.com/huggingface/tokenizers)), this class additionally provides
several advanced alignment methods which can be used to map between the original string (characters and words) and the
token space (e.g., getting the index of the token comprising a given character or the span of characters corresponding
to a given token).
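
To make the idea of alignment concrete, here is a minimal pure-Python sketch of a dictionary subclass that also records character offsets. `ToyEncoding` and `toy_encode` are hypothetical names used only for illustration, and the whitespace splitting stands in for a real tokenizer. The two method names mirror the kind of mapping described above, but the code is an assumption-laden toy, not the library's implementation.

```python
import re


class ToyEncoding(dict):
    """Dictionary of model inputs plus the character span of each token (hypothetical)."""

    def __init__(self, data, offsets):
        super().__init__(data)
        self.offsets = offsets  # list of (start, end) character spans, end exclusive

    def char_to_token(self, char_index):
        """Return the index of the token containing the character, or None."""
        for token_index, (start, end) in enumerate(self.offsets):
            if start <= char_index < end:
                return token_index
        return None  # the character is whitespace between tokens

    def token_to_chars(self, token_index):
        """Return the (start, end) character span covered by the token."""
        return self.offsets[token_index]


def toy_encode(text):
    # One token per whitespace-separated word; ids are assigned on first sight.
    matches = list(re.finditer(r"\S+", text))
    vocab = {}
    input_ids = [vocab.setdefault(m.group(), len(vocab)) for m in matches]
    return ToyEncoding(
        {"input_ids": input_ids, "attention_mask": [1] * len(input_ids)},
        offsets=[m.span() for m in matches],
    )


enc = toy_encode("Hello big world")
print(enc["input_ids"])       # [0, 1, 2]  (plain dictionary access)
print(enc.char_to_token(7))   # 1          (character 7 is the "i" in "big")
print(enc.token_to_chars(2))  # (10, 15)   (span of "world")
```

Because the object is still a dictionary, it can be used wherever plain model inputs are expected, while the extra methods answer alignment questions.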


## PreTrainedTokenizer

[[autodoc]] PreTrainedTokenizer
    - __call__
    - batch_decode
    - decode
    - encode
    - push_to_hub
    - all

## PreTrainedTokenizerFast

The [`PreTrainedTokenizerFast`] depends on the [tokenizers](https://huggingface.co/docs/tokenizers) library. The tokenizers obtained from the 🤗 tokenizers library can be
loaded very simply into 🤗 transformers. Take a look at the [Using tokenizers from 🤗 tokenizers](../fast_tokenizers) page to understand how this is done.

[[autodoc]] PreTrainedTokenizerFast
    - __call__
    - batch_decode
    - decode
    - encode
    - push_to_hub
    - all

## BatchEncoding

[[autodoc]] BatchEncoding
@@ -1,78 +0,0 @@
..
    Copyright 2020 The HuggingFace Team. All rights reserved.

    Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
    the License. You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
    an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
    specific language governing permissions and limitations under the License.

Tokenizer
-----------------------------------------------------------------------------------------------------------------------

A tokenizer is in charge of preparing the inputs for a model. The library contains tokenizers for all the models. Most
of the tokenizers are available in two flavors: a full Python implementation and a "Fast" implementation based on the
Rust library `tokenizers <https://github.com/huggingface/tokenizers>`__. The "Fast" implementations allow:

1. a significant speed-up in particular when doing batched tokenization and
2. additional methods to map between the original string (characters and words) and the token space (e.g. getting the
   index of the token comprising a given character or the span of characters corresponding to a given token). Currently
   no "Fast" implementation is available for the SentencePiece-based tokenizers (for T5, ALBERT, CamemBERT, XLM-RoBERTa
   and XLNet models).

The base classes :class:`~transformers.PreTrainedTokenizer` and :class:`~transformers.PreTrainedTokenizerFast`
implement the common methods for encoding string inputs into model inputs (see below) and instantiating/saving Python and
"Fast" tokenizers either from a local file or directory or from a pretrained tokenizer provided by the library
(downloaded from HuggingFace's AWS S3 repository). They both rely on
:class:`~transformers.tokenization_utils_base.PreTrainedTokenizerBase`, which contains the common methods, and
:class:`~transformers.tokenization_utils_base.SpecialTokensMixin`.

:class:`~transformers.PreTrainedTokenizer` and :class:`~transformers.PreTrainedTokenizerFast` thus implement the main
methods for using all the tokenizers:

- Tokenizing (splitting strings into sub-word token strings), converting token strings to ids and back, and
  encoding/decoding (i.e., tokenizing and converting to integers).
- Adding new tokens to the vocabulary in a way that is independent of the underlying structure (BPE, SentencePiece...).
- Managing special tokens (like mask, beginning-of-sentence, etc.): adding them, assigning them to attributes in the
  tokenizer for easy access and making sure they are not split during tokenization.

:class:`~transformers.BatchEncoding` holds the output of the
:class:`~transformers.tokenization_utils_base.PreTrainedTokenizerBase`'s encoding methods (``__call__``,
``encode_plus`` and ``batch_encode_plus``) and is derived from a Python dictionary. When the tokenizer is a pure Python
tokenizer, this class behaves just like a standard Python dictionary and holds the various model inputs computed by
these methods (``input_ids``, ``attention_mask``...). When the tokenizer is a "Fast" tokenizer (i.e., backed by the
HuggingFace `tokenizers library <https://github.com/huggingface/tokenizers>`__), this class additionally provides
several advanced alignment methods which can be used to map between the original string (characters and words) and the
token space (e.g., getting the index of the token comprising a given character or the span of characters corresponding
to a given token).


PreTrainedTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.PreTrainedTokenizer
    :special-members: __call__, batch_decode, decode, encode, push_to_hub
    :members:


PreTrainedTokenizerFast
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The :class:`~transformers.PreTrainedTokenizerFast` depends on the `tokenizers
<https://huggingface.co/docs/tokenizers>`__ library. The tokenizers obtained from the 🤗 tokenizers library can be
loaded very simply into 🤗 transformers. Take a look at the :doc:`Using tokenizers from 🤗 tokenizers
<../fast_tokenizers>` page to understand how this is done.

.. autoclass:: transformers.PreTrainedTokenizerFast
    :special-members: __call__, batch_decode, decode, encode, push_to_hub
    :members:


BatchEncoding
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BatchEncoding
    :members: