Add chat support to text generation pipeline (#28945)
* Add chat support to text generation pipeline * Better handling of single elements * Deprecate ConversationalPipeline * stash commit * Add missing add_special_tokens kwarg * Update chat templating docs to refer to TextGenerationPipeline instead of ConversationalPipeline * Add ✨TF✨ tests * @require_tf * Add type hint * Add specific deprecation version * Remove unnecessary do_sample * Remove todo - the discrepancy has been resolved * Update src/transformers/tokenization_utils_base.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/pipelines/text_generation.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
This commit is contained in:
@@ -121,13 +121,15 @@ Arr, 'twas easy after all!
|
||||
|
||||
## Is there an automated pipeline for chat?
|
||||
|
||||
Yes, there is: [`ConversationalPipeline`]. This pipeline is designed to make it easy to use chat models. Let's try
|
||||
the `Zephyr` example again, but this time using the pipeline:
|
||||
Yes, there is! Our text generation pipelines support chat inputs, which makes it easy to use chat models. In the past,
|
||||
we used to use a dedicated "ConversationalPipeline" class, but this has now been deprecated and its functionality
|
||||
has been merged into the [`TextGenerationPipeline`]. Let's try the `Zephyr` example again, but this time using
|
||||
a pipeline:
|
||||
|
||||
```python
|
||||
from transformers import pipeline
|
||||
|
||||
pipe = pipeline("conversational", "HuggingFaceH4/zephyr-7b-beta")
|
||||
pipe = pipeline("text-generation", "HuggingFaceH4/zephyr-7b-beta")
|
||||
messages = [
|
||||
{
|
||||
"role": "system",
|
||||
@@ -135,17 +137,14 @@ messages = [
|
||||
},
|
||||
{"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
|
||||
]
|
||||
print(pipe(messages))
|
||||
print(pipe(messages, max_new_tokens=128)[0]['generated_text'][-1]) # Print the assistant's response
|
||||
```
|
||||
|
||||
```text
|
||||
Conversation id: 76d886a0-74bd-454e-9804-0467041a63dc
|
||||
system: You are a friendly chatbot who always responds in the style of a pirate
|
||||
user: How many helicopters can a human eat in one sitting?
|
||||
assistant: Matey, I'm afraid I must inform ye that humans cannot eat helicopters. Helicopters are not food, they are flying machines. Food is meant to be eaten, like a hearty plate o' grog, a savory bowl o' stew, or a delicious loaf o' bread. But helicopters, they be for transportin' and movin' around, not for eatin'. So, I'd say none, me hearties. None at all.
|
||||
{'role': 'assistant', 'content': "Matey, I'm afraid I must inform ye that humans cannot eat helicopters. Helicopters are not food, they are flying machines. Food is meant to be eaten, like a hearty plate o' grog, a savory bowl o' stew, or a delicious loaf o' bread. But helicopters, they be for transportin' and movin' around, not for eatin'. So, I'd say none, me hearties. None at all."}
|
||||
```
|
||||
|
||||
[`ConversationalPipeline`] will take care of all the details of tokenization and calling `apply_chat_template` for you -
|
||||
The pipeline will take care of all the details of tokenization and calling `apply_chat_template` for you -
|
||||
once the model has a chat template, all you need to do is initialize the pipeline and pass it the list of messages!
|
||||
|
||||
## What are "generation prompts"?
|
||||
@@ -191,7 +190,7 @@ Can I ask a question?<|im_end|>
|
||||
Note that this time, we've added the tokens that indicate the start of a bot response. This ensures that when the model
|
||||
generates text it will write a bot response instead of doing something unexpected, like continuing the user's
|
||||
message. Remember, chat models are still just language models - they're trained to continue text, and chat is just a
|
||||
special kind of text to them! You need to guide them with the appropriate control tokens so they know what they're
|
||||
special kind of text to them! You need to guide them with appropriate control tokens, so they know what they're
|
||||
supposed to be doing.
|
||||
|
||||
Not all models require generation prompts. Some models, like BlenderBot and LLaMA, don't have any
|
||||
@@ -340,8 +339,8 @@ tokenizer.chat_template = template # Set the new template
|
||||
tokenizer.push_to_hub("model_name") # Upload your new template to the Hub!
|
||||
```
|
||||
|
||||
The method [`~PreTrainedTokenizer.apply_chat_template`] which uses your chat template is called by the [`ConversationalPipeline`] class, so
|
||||
once you set the correct chat template, your model will automatically become compatible with [`ConversationalPipeline`].
|
||||
The method [`~PreTrainedTokenizer.apply_chat_template`] which uses your chat template is called by the [`TextGenerationPipeline`] class, so
|
||||
once you set the correct chat template, your model will automatically become compatible with [`TextGenerationPipeline`].
|
||||
|
||||
<Tip>
|
||||
If you're fine-tuning a model for chat, in addition to setting a chat template, you should probably add any new chat
|
||||
@@ -356,7 +355,7 @@ template. This will ensure that text generation tools can correctly figure out w
|
||||
|
||||
Before the introduction of chat templates, chat handling was hardcoded at the model class level. For backwards
|
||||
compatibility, we have retained this class-specific handling as default templates, also set at the class level. If a
|
||||
model does not have a chat template set, but there is a default template for its model class, the `ConversationalPipeline`
|
||||
model does not have a chat template set, but there is a default template for its model class, the `TextGenerationPipeline`
|
||||
class and methods like `apply_chat_template` will use the class template instead. You can find out what the default
|
||||
template for your tokenizer is by checking the `tokenizer.default_chat_template` attribute.
|
||||
|
||||
@@ -407,7 +406,7 @@ I'm doing great!<|im_end|>
|
||||
```
|
||||
|
||||
The "user", "system" and "assistant" roles are the standard for chat, and we recommend using them when it makes sense,
|
||||
particularly if you want your model to operate well with [`ConversationalPipeline`]. However, you are not limited
|
||||
particularly if you want your model to operate well with [`TextGenerationPipeline`]. However, you are not limited
|
||||
to these roles - templating is extremely flexible, and any string can be a role.
|
||||
|
||||
### I want to add some chat templates! How should I get started?
|
||||
@@ -418,7 +417,7 @@ not the model owner - if you're using a model with an empty chat template, or on
|
||||
template, please open a [pull request](https://huggingface.co/docs/hub/repositories-pull-requests-discussions) to the model repository so that this attribute can be set properly!
|
||||
|
||||
Once the attribute is set, that's it, you're done! `tokenizer.apply_chat_template` will now work correctly for that
|
||||
model, which means it is also automatically supported in places like `ConversationalPipeline`!
|
||||
model, which means it is also automatically supported in places like `TextGenerationPipeline`!
|
||||
|
||||
By ensuring that models have this attribute, we can make sure that the whole community gets to use the full power of
|
||||
open-source models. Formatting mismatches have been haunting the field and silently harming performance for too long -
|
||||
|
||||
Reference in New Issue
Block a user