Make "tool_use" the default chat template key when tools are passed (#31429)

* Make "tool_use" the default when tools are passed

* Add some opinionated text to the docs

Matt
2024-06-18 13:54:42 +01:00
committed by GitHub
parent cd71f9381b
commit dabf01973a
2 changed files with 30 additions and 8 deletions

@@ -677,6 +677,24 @@ template. This will ensure that text generation tools can correctly figure out w
 </Tip>

+### Why do some models have multiple templates?
+
+Some models use different templates for different use cases. For example, they might use one template for normal chat
+and another for tool-use, or retrieval-augmented generation. In these cases, `tokenizer.chat_template` is a dictionary.
+This can cause some confusion, and where possible, we recommend using a single template for all use-cases. You can use
+Jinja statements like `if tools is defined` and `{% macro %}` definitions to easily wrap multiple code paths in a
+single template.
+
+When a tokenizer has multiple templates, `tokenizer.chat_template` will be a `dict`, where each key is the name
+of a template. The `apply_chat_template` method has special handling for certain template names: Specifically, it will
+look for a template named `default` in most cases, and will raise an error if it can't find one. However, if a template
+named `tool_use` exists when the user has passed a `tools` argument, it will use that instead. To access templates
+with other names, pass the name of the template you want to the `chat_template` argument of
+`apply_chat_template()`.
+
+We find that this can be a bit confusing for users, though - so if you're writing a template yourself, we recommend
+trying to put it all in a single template where possible!
+
 ### What are "default" templates?

 Before the introduction of chat templates, chat handling was hardcoded at the model class level. For backwards

@@ -1781,16 +1781,20 @@ class PreTrainedTokenizerBase(SpecialTokensMixin, PushToHubMixin):
                 chat_template = template_dict[chat_template]
                 if using_default_dict:
                     using_default_template = True
-            elif chat_template is None and "default" in template_dict:
-                chat_template = template_dict["default"]
+            elif chat_template is None:
+                if tools is not None and "tool_use" in template_dict:
+                    chat_template = template_dict["tool_use"]
+                elif "default" in template_dict:
+                    chat_template = template_dict["default"]
+                else:
+                    raise ValueError(
+                        "This model has multiple chat templates with no default specified! Please either pass a chat "
+                        "template or the name of the template you wish to use to the `chat_template` argument. Available "
+                        f"template names are {sorted(template_dict.keys())}."
+                    )
                 if using_default_dict:
                     using_default_template = True
-            elif chat_template is None:
-                raise ValueError(
-                    "This model has multiple chat templates with no default specified! Please either pass a chat "
-                    "template or the name of the template you wish to use to the `chat_template` argument. Available "
-                    f"template names are {sorted(template_dict.keys())}."
-                )
         elif chat_template is None:
             # These are the cases when the model has a single template
             # priority: `chat_template` argument > `tokenizer.chat_template` > `tokenizer.default_chat_template
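
For reviewers who want to check the new lookup order in isolation, here is a minimal standalone sketch of the behaviour this change describes. The helper name `resolve_chat_template` and its signature are hypothetical and are not part of `transformers`; the real logic lives inside `apply_chat_template` and also tracks `using_default_template`, which this sketch leaves out.

```python
from typing import Any, Dict, List, Optional


def resolve_chat_template(
    template_dict: Dict[str, str],
    chat_template: Optional[str] = None,
    tools: Optional[List[Any]] = None,
) -> str:
    """Hypothetical helper mirroring the selection order in the hunk above."""
    if chat_template is not None:
        # A known name selects that entry; any other string is assumed
        # to be a complete template and is returned unchanged.
        return template_dict.get(chat_template, chat_template)
    if tools is not None and "tool_use" in template_dict:
        # New in this commit: "tool_use" takes priority when tools are passed.
        return template_dict["tool_use"]
    if "default" in template_dict:
        return template_dict["default"]
    raise ValueError(
        "No default template specified. Available template names are "
        f"{sorted(template_dict.keys())}."
    )


# Placeholder template bodies, for illustration only.
templates = {"default": "DEFAULT", "tool_use": "TOOLS", "rag": "RAG"}
print(resolve_chat_template(templates))
print(resolve_chat_template(templates, tools=[{"name": "get_weather"}]))
print(resolve_chat_template(templates, chat_template="rag"))
```

Note that the check is `tools is not None`, so under this sketch an empty `tools` list still selects `tool_use`, and an explicit `chat_template` name always wins over both special names.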