[docs] fix xref to PreTrainedModel.generate (#11049)
* fix xref to generate
* do the same for search methods
* style
* style
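Note on the change (a sketch, not part of the commit): the cross-references presumably moved because `generate` and the search methods are defined on `GenerationMixin` and only inherited by `PreTrainedModel`, so the documented target is the mixin. The classes below are hypothetical stand-ins, not the real transformers classes; they only illustrate how to find the class that actually defines an inherited method.

```python
# Hypothetical stand-ins for illustration only; these are NOT the transformers classes.
class ToyGenerationMixin:
    def generate(self):
        return "generated"


class ToyPreTrainedModel(ToyGenerationMixin):
    """Inherits generate() without redefining it."""


def defining_class(cls, name):
    """Return the first class in the MRO whose own namespace defines `name`."""
    for base in cls.__mro__:
        if name in vars(base):
            return base
    raise AttributeError(name)


owner = defining_class(ToyPreTrainedModel, "generate")
print(owner.__name__)                            # ToyGenerationMixin
print(ToyPreTrainedModel.generate.__qualname__)  # ToyGenerationMixin.generate
```

A cross-reference written against the subclass relies on the documentation tool resolving inherited members; pointing at the defining class avoids that dependency.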
@@ -13,19 +13,21 @@
 Utilities for Generation
 -----------------------------------------------------------------------------------------------------------------------
 
-This page lists all the utility functions used by :meth:`~transformers.PreTrainedModel.generate`,
-:meth:`~transformers.PreTrainedModel.greedy_search`, :meth:`~transformers.PreTrainedModel.sample`,
-:meth:`~transformers.PreTrainedModel.beam_search`, :meth:`~transformers.PreTrainedModel.beam_sample`, and
-:meth:`~transformers.PreTrainedModel.group_beam_search`.
+This page lists all the utility functions used by :meth:`~transformers.generation_utils.GenerationMixin.generate`,
+:meth:`~transformers.generation_utils.GenerationMixin.greedy_search`,
+:meth:`~transformers.generation_utils.GenerationMixin.sample`,
+:meth:`~transformers.generation_utils.GenerationMixin.beam_search`,
+:meth:`~transformers.generation_utils.GenerationMixin.beam_sample`, and
+:meth:`~transformers.generation_utils.GenerationMixin.group_beam_search`.
 
 Most of those are only useful if you are studying the code of the generate methods in the library.
 
 Generate Outputs
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-The output of :meth:`~transformers.PreTrainedModel.generate` is an instance of a subclass of
+The output of :meth:`~transformers.generation_utils.GenerationMixin.generate` is an instance of a subclass of
 :class:`~transformers.file_utils.ModelOutput`. This output is a data structure containing all the information returned
-by :meth:`~transformers.PreTrainedModel.generate`, but that can also be used as tuple or dictionary.
+by :meth:`~transformers.generation_utils.GenerationMixin.generate`, but that can also be used as tuple or dictionary.
 
 Here's an example:
 
@@ -61,7 +61,7 @@ Implementation Notes
 - Model predictions are intended to be identical to the original implementation when
 :obj:`force_bos_token_to_be_generated=True`. This only works, however, if the string you pass to
 :func:`fairseq.encode` starts with a space.
-- :meth:`~transformers.BartForConditionalGeneration.generate` should be used for conditional generation tasks like
+- :meth:`~transformers.generation_utils.GenerationMixin.generate` should be used for conditional generation tasks like
 summarization, see the example in that docstrings.
 - Models that load the `facebook/bart-large-cnn` weights will not have a :obj:`mask_token_id`, or be able to perform
 mask-filling tasks.
@@ -44,9 +44,9 @@ Tips:
 
 For more information about which prefix to use, it is easiest to look into Appendix D of the `paper
 <https://arxiv.org/pdf/1910.10683.pdf>`__. - For sequence-to-sequence generation, it is recommended to use
-:obj:`T5ForConditionalGeneration.generate()`. This method takes care of feeding the encoded input via cross-attention
-layers to the decoder and auto-regressively generates the decoder output. - T5 uses relative scalar embeddings.
-Encoder input padding can be done on the left and on the right.
+:meth:`~transformers.generation_utils.GenerationMixin.generate`. This method takes care of feeding the encoded input
+via cross-attention layers to the decoder and auto-regressively generates the decoder output. - T5 uses relative
+scalar embeddings. Encoder input padding can be done on the left and on the right.
 
 This model was contributed by `thomwolf <https://huggingface.co/thomwolf>`__. The original code can be found `here
 <https://github.com/google-research/text-to-text-transfer-transformer>`__.
@@ -505,8 +505,8 @@ This outputs a (hopefully) coherent next token following the original sequence,
 >>> print(resulting_string)
 Hugging Face is based in DUMBO, New York City, and has
 
-In the next section, we show how :func:`~transformers.PreTrainedModel.generate` can be used to generate multiple tokens
-up to a specified length instead of one token at a time.
+In the next section, we show how :func:`~transformers.generation_utils.GenerationMixin.generate` can be used to
+generate multiple tokens up to a specified length instead of one token at a time.
 
 Text Generation
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -906,8 +906,9 @@ class RagSequenceForGeneration(RagPreTrainedModel):
 **model_kwargs
 ):
 """
-Implements RAG sequence "thorough" decoding. Read the :meth:`~transformers.PreTrainedModel.generate``
-documentation for more information on how to set other generate input parameters.
+Implements RAG sequence "thorough" decoding. Read the
+:meth:`~transformers.generation_utils.GenerationMixin.generate`` documentation for more information on how to
+set other generate input parameters.
 
 Args:
 input_ids (:obj:`torch.LongTensor` of shape :obj:`(batch_size, sequence_length)`, `optional`):
@@ -942,14 +943,15 @@ class RagSequenceForGeneration(RagPreTrainedModel):
 to be set to :obj:`False` if used while training with distributed backend.
 num_return_sequences(:obj:`int`, `optional`, defaults to 1):
 The number of independently computed returned sequences for each element in the batch. Note that this
-is not the value we pass to the ``generator``'s `:func:`~transformers.PreTrainedModel.generate``
-function, where we set ``num_return_sequences`` to :obj:`num_beams`.
+is not the value we pass to the ``generator``'s
+`:func:`~transformers.generation_utils.GenerationMixin.generate`` function, where we set
+``num_return_sequences`` to :obj:`num_beams`.
 num_beams (:obj:`int`, `optional`, defaults to 1):
 Number of beams for beam search. 1 means no beam search.
 n_docs (:obj:`int`, `optional`, defaults to :obj:`config.n_docs`)
 Number of documents to retrieve and/or number of documents for which to generate an answer.
 kwargs:
-Additional kwargs will be passed to :meth:`~transformers.PreTrainedModel.generate`.
+Additional kwargs will be passed to :meth:`~transformers.generation_utils.GenerationMixin.generate`.
 
 Return:
 :obj:`torch.LongTensor` of shape :obj:`(batch_size * num_return_sequences, sequence_length)`: The generated
@@ -1452,8 +1454,9 @@ class RagTokenForGeneration(RagPreTrainedModel):
 enabled.
 num_return_sequences(:obj:`int`, `optional`, defaults to 1):
 The number of independently computed returned sequences for each element in the batch. Note that this
-is not the value we pass to the ``generator``'s `:func:`~transformers.PreTrainedModel.generate`
-function, where we set ``num_return_sequences`` to :obj:`num_beams`.
+is not the value we pass to the ``generator``'s
+`:func:`~transformers.generation_utils.GenerationMixin.generate` function, where we set
+``num_return_sequences`` to :obj:`num_beams`.
 decoder_start_token_id (:obj:`int`, `optional`):
 If an encoder-decoder model starts decoding with a different token than `bos`, the id of that token.
 n_docs (:obj:`int`, `optional`, defaults to :obj:`config.n_docs`)
@@ -1130,8 +1130,9 @@ class TFRagTokenForGeneration(TFRagPreTrainedModel, TFCausalLanguageModelingLoss
 Number of beams for beam search. 1 means no beam search.
 num_return_sequences(:obj:`int`, `optional`, defaults to 1):
 The number of independently computed returned sequences for each element in the batch. Note that this
-is not the value we pass to the ``generator``'s `:func:`~transformers.PreTrainedModel.generate`
-function, where we set ``num_return_sequences`` to :obj:`num_beams`.
+is not the value we pass to the ``generator``'s
+`:func:`~transformers.generation_utils.GenerationMixin.generate` function, where we set
+``num_return_sequences`` to :obj:`num_beams`.
 decoder_start_token_id (:obj:`int`, `optional`):
 If an encoder-decoder model starts decoding with a different token than `bos`, the id of that token.
 n_docs (:obj:`int`, `optional`, defaults to :obj:`config.n_docs`)
@@ -1682,8 +1683,9 @@ class TFRagSequenceForGeneration(TFRagPreTrainedModel, TFCausalLanguageModelingL
 **model_kwargs
 ):
 """
-Implements RAG sequence "thorough" decoding. Read the :meth:`~transformers.PreTrainedModel.generate``
-documentation for more information on how to set other generate input parameters
+Implements RAG sequence "thorough" decoding. Read the
+:meth:`~transformers.generation_utils.GenerationMixin.generate`` documentation for more information on how to
+set other generate input parameters
 
 Args:
 input_ids (:obj:`tf.Tensor` of shape :obj:`(batch_size, sequence_length)`, `optional`):
@@ -1711,14 +1713,15 @@ class TFRagSequenceForGeneration(TFRagPreTrainedModel, TFCausalLanguageModelingL
 to be set to :obj:`False` if used while training with distributed backend.
 num_return_sequences(:obj:`int`, `optional`, defaults to 1):
 The number of independently computed returned sequences for each element in the batch. Note that this
-is not the value we pass to the ``generator``'s `:func:`~transformers.PreTrainedModel.generate``
-function, where we set ``num_return_sequences`` to :obj:`num_beams`.
+is not the value we pass to the ``generator``'s
+`:func:`~transformers.generation_utils.GenerationMixin.generate`` function, where we set
+``num_return_sequences`` to :obj:`num_beams`.
 num_beams (:obj:`int`, `optional`, defaults to 1):
 Number of beams for beam search. 1 means no beam search.
 n_docs (:obj:`int`, `optional`, defaults to :obj:`config.n_docs`)
 Number of documents to retrieve and/or number of documents for which to generate an answer.
 kwargs:
-Additional kwargs will be passed to :meth:`~transformers.PreTrainedModel.generate`
+Additional kwargs will be passed to :meth:`~transformers.generation_utils.GenerationMixin.generate`
 
 Return:
 :obj:`tf.Tensor` of shape :obj:`(batch_size * num_return_sequences, sequence_length)`: The generated