Update doc to new model outputs (#5946)

* Update doc to new model outputs
* Fix outputs in quicktour
2020-07-21 18:13:55 -04:00
parent ddd40b3211
commit e714412fe6
16 changed files with 73 additions and 47 deletions
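
The quicktour note added in this commit says that PyTorch model outputs are dataclasses that can also be indexed like a tuple or a dictionary, with unset (None) attributes skipped. The sketch below illustrates that behaviour in plain Python. It is not the transformers implementation: `SketchOutput` and `to_tuple` are hypothetical names made up for this example, the logits are plain lists rather than tensors, and raising `KeyError` for an unset field looked up by name is an assumption of this sketch rather than documented library behaviour.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional, Tuple


@dataclass
class SketchOutput:
    """Hypothetical stand-in for a model output class (not the transformers one)."""

    loss: Optional[Any] = None
    logits: Optional[Any] = None
    hidden_states: Optional[Any] = None
    attentions: Optional[Any] = None

    def to_tuple(self) -> Tuple[Any, ...]:
        # Keep only the fields that were set (not None), in declaration order.
        values = (getattr(self, f.name) for f in fields(self))
        return tuple(v for v in values if v is not None)

    def __getitem__(self, key):
        if isinstance(key, str):
            # Dictionary-style access by field name.
            names = {f.name for f in fields(self)}
            value = getattr(self, key) if key in names else None
            if value is None:
                # Assumption of this sketch: unset fields are not reachable by name.
                raise KeyError(key)
            return value
        # Integers and slices index into the tuple of non-None fields.
        return self.to_tuple()[key]


# Only ``logits`` is filled, as in the quicktour output shown in the diff.
outputs = SketchOutput(logits=[[-4.0833, 4.3364], [0.0818, -0.0418]])

first = outputs[0]           # integer index skips the unset ``loss`` field
by_name = outputs["logits"]  # string index works like a dictionary lookup
head = outputs[:1]           # slices return a tuple of the set fields
```

Because `loss` is None it is skipped, so `outputs[0]` and `outputs.logits` refer to the same object. That is why the quicktour can switch from `pt_outputs[0]` to `pt_outputs.logits` without changing the result.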
--- a/docs/source/quicktour.rst
+++ b/docs/source/quicktour.rst
@@ -230,13 +230,18 @@ final activations of the model.

    >>> ## PYTORCH CODE
    >>> print(pt_outputs)
-    (tensor([[-4.0833,  4.3364],
-            [ 0.0818, -0.0418]], grad_fn=<AddmmBackward>),)
+    SequenceClassifierOutput(loss=None, logits=tensor([[-4.0833,  4.3364],
+            [ 0.0818, -0.0418]], grad_fn=<AddmmBackward>), hidden_states=None, attentions=None)
    >>> ## TENSORFLOW CODE
    >>> print(tf_outputs)
    (<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
-    array([[-4.0832963 ,  4.3364134 ],
-           [ 0.08181238, -0.04178794]], dtype=float32)>,)
+    array([[-4.0832963 ,  4.336414  ],
+           [ 0.08181786, -0.04179301]], dtype=float32)>,)
+
+The model can return more than just the final activations, which is why the PyTorch output is a special class and the
+TensorFlow output is a tuple. Here we only asked for the final activations, so we get a tuple with one element on the
+TensorFlow side and a :class:`~transformers.modeling_outputs.SequenceClassifierOutput` with just the ``logits`` field
+filled on the PyTorch side.

 .. note::

@@ -249,7 +254,7 @@ Let's apply the SoftMax activation to get predictions.

    >>> ## PYTORCH CODE
    >>> import torch.nn.functional as F
-    >>> pt_predictions = F.softmax(pt_outputs[0], dim=-1)
+    >>> pt_predictions = F.softmax(pt_outputs.logits, dim=-1)
    >>> ## TENSORFLOW CODE
    >>> import tensorflow as tf
    >>> tf_predictions = tf.nn.softmax(tf_outputs[0], axis=-1)
@@ -262,7 +267,7 @@ We can see we get the numbers from before:
    >>> print(tf_predictions)
    tf.Tensor(
    [[2.2042994e-04 9.9977952e-01]
-     [5.3086078e-01 4.6913919e-01]], shape=(2, 2), dtype=float32)
+     [5.3086340e-01 4.6913657e-01]], shape=(2, 2), dtype=float32)
    >>> ## PYTORCH CODE
    >>> print(pt_predictions)
    tensor([[2.2043e-04, 9.9978e-01],
@@ -285,6 +290,12 @@ training loop. 🤗 Transformers also provides a :class:`~transformers.Trainer`
 you are using TensorFlow) class to help with your training (taking care of things such as distributed training, mixed
 precision, etc.). See the :doc:`training tutorial <training>` for more details.

+.. note::
+
+    PyTorch model outputs are special dataclasses, so you get autocompletion for their attributes in an IDE. They
+    also behave like a tuple or a dictionary (you can index them with an integer, a slice or a string); when
+    indexed that way, attributes that are not set (those with :obj:`None` values) are ignored.
+
 Once your model is fine-tuned, you can save it with its tokenizer in the following way:

 ::