Add chat support to text generation pipeline (#28945)

* Add chat support to text generation pipeline * Better handling of single elements * Deprecate ConversationalPipeline * stash commit * Add missing add_special_tokens kwarg * Update chat templating docs to refer to TextGenerationPipeline instead of ConversationalPipeline * Add ✨TF✨ tests * @require_tf * Add type hint * Add specific deprecation version * Remove unnecessary do_sample * Remove todo - the discrepancy has been resolved * Update src/transformers/tokenization_utils_base.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/pipelines/text_generation.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-02-16 16:41:01 +00:00
parent 636b03244c
commit 2f1003be86
5 changed files with 181 additions and 35 deletions
--- a/docs/source/en/chat_templating.md
+++ b/docs/source/en/chat_templating.md
@@ -121,13 +121,15 @@ Arr, 'twas easy after all!

 ## Is there an automated pipeline for chat?

-Yes, there is: [`ConversationalPipeline`]. This pipeline is designed to make it easy to use chat models. Let's try
-the `Zephyr` example again, but this time using the pipeline:
+Yes, there is! Our text generation pipelines support chat inputs, which makes it easy to use chat models. In the past,
+we used to use a dedicated "ConversationalPipeline" class, but this has now been deprecated and its functionality
+has been merged into the [`TextGenerationPipeline`]. Let's try the `Zephyr` example again, but this time using 
+a pipeline:

 ```python
 from transformers import pipeline

-pipe = pipeline("conversational", "HuggingFaceH4/zephyr-7b-beta")
+pipe = pipeline("text-generation", "HuggingFaceH4/zephyr-7b-beta")
 messages = [
    {
        "role": "system",
@@ -135,17 +137,14 @@ messages = [
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
 ]
-print(pipe(messages))
+print(pipe(messages, max_new_tokens=128)[0]['generated_text'][-1])  # Print the assistant's response
 ```

 ```text
-Conversation id: 76d886a0-74bd-454e-9804-0467041a63dc
-system: You are a friendly chatbot who always responds in the style of a pirate
-user: How many helicopters can a human eat in one sitting?
-assistant: Matey, I'm afraid I must inform ye that humans cannot eat helicopters. Helicopters are not food, they are flying machines. Food is meant to be eaten, like a hearty plate o' grog, a savory bowl o' stew, or a delicious loaf o' bread. But helicopters, they be for transportin' and movin' around, not for eatin'. So, I'd say none, me hearties. None at all.
+{'role': 'assistant', 'content': "Matey, I'm afraid I must inform ye that humans cannot eat helicopters. Helicopters are not food, they are flying machines. Food is meant to be eaten, like a hearty plate o' grog, a savory bowl o' stew, or a delicious loaf o' bread. But helicopters, they be for transportin' and movin' around, not for eatin'. So, I'd say none, me hearties. None at all."}
 ```

-[`ConversationalPipeline`] will take care of all the details of tokenization and calling `apply_chat_template` for you -
+The pipeline will take care of all the details of tokenization and calling `apply_chat_template` for you -
 once the model has a chat template, all you need to do is initialize the pipeline and pass it the list of messages!

 ## What are "generation prompts"?
@@ -191,7 +190,7 @@ Can I ask a question?<|im_end|>
 Note that this time, we've added the tokens that indicate the start of a bot response. This ensures that when the model
 generates text it will write a bot response instead of doing something unexpected, like continuing the user's 
 message. Remember, chat models are still just language models - they're trained to continue text, and chat is just a 
-special kind of text to them! You need to guide them with the appropriate control tokens so they know what they're 
+special kind of text to them! You need to guide them with appropriate control tokens, so they know what they're 
 supposed to be doing.

 Not all models require generation prompts. Some models, like BlenderBot and LLaMA, don't have any
@@ -340,8 +339,8 @@ tokenizer.chat_template = template  # Set the new template
 tokenizer.push_to_hub("model_name")  # Upload your new template to the Hub!
 ```

-The method [`~PreTrainedTokenizer.apply_chat_template`] which uses your chat template is called by the [`ConversationalPipeline`] class, so 
-once you set the correct chat template, your model will automatically become compatible with [`ConversationalPipeline`].
+The method [`~PreTrainedTokenizer.apply_chat_template`] which uses your chat template is called by the [`TextGenerationPipeline`] class, so 
+once you set the correct chat template, your model will automatically become compatible with [`TextGenerationPipeline`].

 <Tip>
 If you're fine-tuning a model for chat, in addition to setting a chat template, you should probably add any new chat
@@ -356,7 +355,7 @@ template. This will ensure that text generation tools can correctly figure out w

 Before the introduction of chat templates, chat handling was hardcoded at the model class level. For backwards 
 compatibility, we have retained this class-specific handling as default templates, also set at the class level. If a
-model does not have a chat template set, but there is a default template for its model class, the `ConversationalPipeline`
+model does not have a chat template set, but there is a default template for its model class, the `TextGenerationPipeline`
 class and methods like `apply_chat_template` will use the class template instead. You can find out what the default
 template for your tokenizer is by checking the `tokenizer.default_chat_template` attribute.

@@ -407,7 +406,7 @@ I'm doing great!<|im_end|>
 ```

 The "user", "system" and "assistant" roles are the standard for chat, and we recommend using them when it makes sense,
-particularly if you want your model to operate well with [`ConversationalPipeline`]. However, you are not limited
+particularly if you want your model to operate well with [`TextGenerationPipeline`]. However, you are not limited
 to these roles - templating is extremely flexible, and any string can be a role.

 ### I want to add some chat templates! How should I get started?
@@ -418,7 +417,7 @@ not the model owner - if you're using a model with an empty chat template, or on
 template, please open a [pull request](https://huggingface.co/docs/hub/repositories-pull-requests-discussions) to the model repository so that this attribute can be set properly!

 Once the attribute is set, that's it, you're done! `tokenizer.apply_chat_template` will now work correctly for that
-model, which means it is also automatically supported in places like `ConversationalPipeline`!
+model, which means it is also automatically supported in places like `TextGenerationPipeline`!

 By ensuring that models have this attribute, we can make sure that the whole community gets to use the full power of
 open-source models. Formatting mismatches have been haunting the field and silently harming performance for too long -