Add a section on writing tool templates to the chat template docs (#33924)
* Add a section on writing tool templates to the chat template docs * Small cleanups
This commit is contained in:
@@ -962,4 +962,129 @@ tokenizer.chat_template = open("template.jinja").read()
|
||||
|
||||
As an added bonus, when you write a long, multi-line template in a separate file, line numbers in that file will
|
||||
exactly correspond to line numbers in template parsing or execution errors. This will make it much easier to
|
||||
identify the source of issues.
|
||||
identify the source of issues.
|
||||
|
||||
### Writing templates for tools
|
||||
|
||||
Although chat templates do not enforce a specific API for tools (or for anything, really), we recommend
|
||||
template authors try to stick to a standard API where possible. The whole point of chat templates is to allow code
|
||||
to be transferable across models, so deviating from the standard tools API means users will have to write
|
||||
custom code to use tools with your model. Sometimes it's unavoidable, but often with clever templating you can
|
||||
make the standard API work!
|
||||
|
||||
Below, we'll list the elements of the standard API, and give tips on writing templates that will work well with it.
|
||||
|
||||
#### Tool definitions
|
||||
|
||||
Your template should expect that the variable `tools` will either be null (if no tools are passed), or is a list
|
||||
of JSON schema dicts. Our chat template methods allow users to pass tools as either JSON schema or Python functions, but when
|
||||
functions are passed, we automatically generate JSON schema and pass that to your template. As a result, the
|
||||
`tools` variable that your template receives will always be a list of JSON schema. Here is
|
||||
a sample tool JSON schema:
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "function",
|
||||
"function": {
|
||||
"name": "multiply",
|
||||
"description": "A function that multiplies two numbers",
|
||||
"parameters": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"a": {
|
||||
"type": "number",
|
||||
"description": "The first number to multiply"
|
||||
},
|
||||
"b": {
|
||||
"type": "number",
|
||||
"description": "The second number to multiply"
|
||||
}
|
||||
},
|
||||
"required": ["a", "b"]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
And here is some example code for handling tools in your chat template. Remember, this is just an example for a
|
||||
specific format - your model will probably need different formatting!
|
||||
|
||||
```text
|
||||
{%- if tools %}
|
||||
{%- for tool in tools %}
|
||||
{{- '<tool>' + tool['function']['name'] + '\n' }}
|
||||
{%- for argument in tool['function']['parameters']['properties'] %}
|
||||
{{- argument + ': ' + tool['function']['parameters']['properties'][argument]['description'] + '\n' }}
|
||||
{%- endfor %}
|
||||
{{- '\n</tool>' }}
|
||||
{%- endif %}
|
||||
{%- endif %}
|
||||
```
|
||||
|
||||
The specific tokens and tool descriptions your template renders should of course be chosen to match the ones your model
|
||||
was trained with. There is no requirement that your **model** understands JSON schema input, only that your template can translate
|
||||
JSON schema into your model's format. For example, [Command-R](https://huggingface.co/CohereForAI/c4ai-command-r-plus-08-2024)
|
||||
was trained with tools defined using Python function headers, but the Command-R tool template accepts JSON schema,
|
||||
converts types internally and renders the input tools as Python headers. You can do a lot with templates!
|
||||
|
||||
#### Tool calls
|
||||
|
||||
Tool calls, if present, will be a list attached to a message with the "assistant" role. Note that `tool_calls` is
|
||||
always a list, even though most tool-calling models only support single tool calls at a time, which means
|
||||
the list will usually only have a single element. Here is a sample message dict containing a tool call:
|
||||
|
||||
```json
|
||||
{
|
||||
"role": "assistant",
|
||||
"tool_calls": [
|
||||
{
|
||||
"type": "function",
|
||||
"function": {
|
||||
"name": "multiply",
|
||||
"arguments": {
|
||||
"a": 5,
|
||||
"b": 6
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
And a common pattern for handling them would be something like this:
|
||||
|
||||
```text
|
||||
{%- if message['role'] == 'assistant' and 'tool_calls' in message %}
|
||||
{%- for tool_call in message['tool_calls'] %}
|
||||
{{- '<tool_call>' + tool_call['function']['name'] + '\n' + tool_call['function']['arguments']|tojson + '\n</tool_call>' }}
|
||||
{%- endif %}
|
||||
{%- endfor %}
|
||||
{%- endif %}
|
||||
```
|
||||
|
||||
Again, you should render the tool call with the formatting and special tokens that your model expects.
|
||||
|
||||
#### Tool responses
|
||||
|
||||
Tool responses have a simple format: They are a message dict with the "tool" role, a "name" key giving the name
|
||||
of the called function, and a "content" key containing the result of the tool call. Here is a sample tool response:
|
||||
|
||||
```json
|
||||
{
|
||||
"role": "tool",
|
||||
"name": "multiply",
|
||||
"content": "30"
|
||||
}
|
||||
```
|
||||
|
||||
You don't need to use all of the keys in the tool response. For example, if your model doesn't expect the function
|
||||
name to be included in the tool response, then rendering it can be as simple as:
|
||||
|
||||
```text
|
||||
{%- if message['role'] == 'tool' %}
|
||||
{{- "<tool_result>" + message['content'] + "</tool_result>" }}
|
||||
{%- endif %}
|
||||
```
|
||||
|
||||
Again, remember that the actual formatting and special tokens are model-specific - you should take a lot of care
|
||||
to ensure that tokens, whitespace and everything else exactly match the format your model was trained with!
|
||||
Reference in New Issue
Block a user