From 88ac60f7b5f6d4b62245dc21653ea3d5db7d4935 Mon Sep 17 00:00:00 2001
From: Hamel Husain
Date: Mon, 26 Apr 2021 19:18:37 -0700
Subject: [PATCH] update QuickTour docs to reflect model output object (#11462)

* update docs to reflect model output object

* run make style`
---
 docs/source/main_classes/output.rst |  4 ++--
 docs/source/quicktour.rst           | 21 ++++++++++-----------
 2 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/docs/source/main_classes/output.rst b/docs/source/main_classes/output.rst
index 7b7c05568a..a627571f24 100644
--- a/docs/source/main_classes/output.rst
+++ b/docs/source/main_classes/output.rst
@@ -13,8 +13,8 @@
 Model outputs
 -----------------------------------------------------------------------------------------------------------------------

-PyTorch models have outputs that are instances of subclasses of :class:`~transformers.file_utils.ModelOutput`. Those
-are data structures containing all the information returned by the model, but that can also be used as tuples or
+All models have outputs that are instances of subclasses of :class:`~transformers.file_utils.ModelOutput`. Those are
+data structures containing all the information returned by the model, but that can also be used as tuples or
 dictionaries.

 Let's see of this looks on an example:
diff --git a/docs/source/quicktour.rst b/docs/source/quicktour.rst
index 51d962b79b..b3005b59e8 100644
--- a/docs/source/quicktour.rst
+++ b/docs/source/quicktour.rst
@@ -238,23 +238,22 @@ keys directly to tensors, for a PyTorch model, you need to unpack the dictionary
     >>> ## TENSORFLOW CODE
     >>> tf_outputs = tf_model(tf_batch)

-In 🤗 Transformers, all outputs are tuples (with only one element potentially). Here, we get a tuple with just the final
-activations of the model.
+In 🤗 Transformers, all outputs are objects that contain the model's final activations along with other metadata. These
+objects are described in greater detail :doc:`here `. For now, let's inspect the output ourselves:

 .. code-block::

     >>> ## PYTORCH CODE
     >>> print(pt_outputs)
-    (tensor([[-4.0833, 4.3364],
-            [ 0.0818, -0.0418]], grad_fn=),)
+    SequenceClassifierOutput(loss=None, logits=tensor([[-4.0833, 4.3364],
+            [ 0.0818, -0.0418]], grad_fn=), hidden_states=None, attentions=None)
     >>> ## TENSORFLOW CODE
     >>> print(tf_outputs)
-    (,)
+    TFSequenceClassifierOutput(loss=None, logits=, hidden_states=None, attentions=None)

-The model can return more than just the final activations, which is why the output is a tuple. Here we only asked for
-the final activations, so we get a tuple with one element.
+Notice how the output object has a ``logits`` attribute. You can use this to access the model's final activations.

 .. note::

@@ -267,10 +266,10 @@ Let's apply the SoftMax activation to get predictions.

     >>> ## PYTORCH CODE
     >>> import torch.nn.functional as F
-    >>> pt_predictions = F.softmax(pt_outputs[0], dim=-1)
+    >>> pt_predictions = F.softmax(pt_outputs.logits, dim=-1)
     >>> ## TENSORFLOW CODE
     >>> import tensorflow as tf
-    >>> tf_predictions = tf.nn.softmax(tf_outputs[0], axis=-1)
+    >>> tf.nn.softmax(tf_outputs.logits, axis=-1)

 We can see we get the numbers from before: