From de4112e4d20795b27bad0050e30f324a1a3a26f2 Mon Sep 17 00:00:00 2001
From: Matt
Date: Fri, 4 Oct 2024 14:40:44 +0100
Subject: [PATCH] Add a section on writing tool templates to the chat template docs (#33924)

* Add a section on writing tool templates to the chat template docs

* Small cleanups
---
 docs/source/en/chat_templating.md | 127 +++++++++++++++++++++++++++++-
 1 file changed, 126 insertions(+), 1 deletion(-)

diff --git a/docs/source/en/chat_templating.md b/docs/source/en/chat_templating.md
index 543d9fa00b..de3d056c91 100644
--- a/docs/source/en/chat_templating.md
+++ b/docs/source/en/chat_templating.md
@@ -962,4 +962,129 @@ tokenizer.chat_template = open("template.jinja").read()
 
 As an added bonus, when you write a long, multi-line template in a separate file, line numbers in that file will
 exactly correspond to line numbers in template parsing or execution errors. This will make it much easier to
-identify the source of issues.
\ No newline at end of file
+identify the source of issues.
+
+### Writing templates for tools
+
+Although chat templates do not enforce a specific API for tools (or for anything, really), we recommend
+template authors try to stick to a standard API where possible. The whole point of chat templates is to allow code
+to be transferable across models, so deviating from the standard tools API means users will have to write
+custom code to use tools with your model. Sometimes it's unavoidable, but often with clever templating you can
+make the standard API work!
+
+Below, we'll list the elements of the standard API, and give tips on writing templates that will work well with it.
+
+#### Tool definitions
+
+Your template should expect that the variable `tools` will be either null (if no tools are passed) or a list
+of JSON schema dicts. Our chat template methods allow users to pass tools as either JSON schema or Python functions, but when
+functions are passed, we automatically generate JSON schema and pass that to your template. As a result, the
+`tools` variable that your template receives will always be a list of JSON schemas. Here is
+a sample tool JSON schema:
+
+```json
+{
+  "type": "function",
+  "function": {
+    "name": "multiply",
+    "description": "A function that multiplies two numbers",
+    "parameters": {
+      "type": "object",
+      "properties": {
+        "a": {
+          "type": "number",
+          "description": "The first number to multiply"
+        },
+        "b": {
+          "type": "number",
+          "description": "The second number to multiply"
+        }
+      },
+      "required": ["a", "b"]
+    }
+  }
+}
+```
+
+And here is some example code for handling tools in your chat template. Remember, this is just an example for a
+specific format - your model will probably need different formatting!
+
+```text
+{%- if tools %}
+    {%- for tool in tools %}
+        {{- '' + tool['function']['name'] + '\n' }}
+        {%- for argument in tool['function']['parameters']['properties'] %}
+            {{- argument + ': ' + tool['function']['parameters']['properties'][argument]['description'] + '\n' }}
+        {%- endfor %}
+        {{- '\n' }}
+    {%- endfor %}
+{%- endif %}
+```
+
+The specific tokens and tool descriptions your template renders should of course be chosen to match the ones your model
+was trained with. There is no requirement that your **model** understands JSON schema input, only that your template can translate
+JSON schema into your model's format. For example, [Command-R](https://huggingface.co/CohereForAI/c4ai-command-r-plus-08-2024)
+was trained with tools defined using Python function headers, but the Command-R tool template accepts JSON schema,
+converts types internally and renders the input tools as Python headers. You can do a lot with templates!
+
+#### Tool calls
+
+Tool calls, if present, will be a list attached to a message with the "assistant" role. Note that `tool_calls` is
+always a list, even though most tool-calling models only support single tool calls at a time, which means
+the list will usually only have a single element. Here is a sample message dict containing a tool call:
+
+```json
+{
+  "role": "assistant",
+  "tool_calls": [
+    {
+      "type": "function",
+      "function": {
+        "name": "multiply",
+        "arguments": {
+          "a": 5,
+          "b": 6
+        }
+      }
+    }
+  ]
+}
+```
+
+And a common pattern for handling them would be something like this:
+
+```text
+{%- if message['role'] == 'assistant' and 'tool_calls' in message %}
+    {%- for tool_call in message['tool_calls'] %}
+        {#- Render the function name, then its arguments serialized as JSON. #}
+        {{- '' + tool_call['function']['name'] + '\n' + tool_call['function']['arguments']|tojson + '\n' }}
+    {%- endfor %}
+{%- endif %}
+```
+
+Again, you should render the tool call with the formatting and special tokens that your model expects.
+
+#### Tool responses
+
+Tool responses have a simple format: they are a message dict with the "tool" role, a "name" key giving the name
+of the called function, and a "content" key containing the result of the tool call. Here is a sample tool response:
+
+```json
+{
+  "role": "tool",
+  "name": "multiply",
+  "content": "30"
+}
+```
+
+You don't need to use all of the keys in the tool response. For example, if your model doesn't expect the function
+name to be included in the tool response, then rendering it can be as simple as:
+
+```text
+{%- if message['role'] == 'tool' %}
+    {{- "" + message['content'] + "" }}
+{%- endif %}
+```
+
+Again, remember that the actual formatting and special tokens are model-specific - you should take a lot of care
+to ensure that tokens, whitespace and everything else exactly match the format your model was trained with!
\ No newline at end of file
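As a rough cross-check of the three template snippets in this patch, the same rendering logic can be mirrored in plain Python and run against the sample dicts. This is only a sketch under stated assumptions: the helper names `render_tools` and `render_message` are hypothetical and not part of any library, no special tokens are emitted (a real template would add whatever markers the model was trained with), and `json.dumps` merely stands in for the Jinja `tojson` filter, whose exact output may differ in spacing or escaping.

```python
import json


def render_tools(tools):
    """Hypothetical helper mirroring the tool-definition loop:
    the function name, then one 'argument: description' line per argument."""
    if not tools:  # `tools` is null/None when no tools are passed
        return ""
    parts = []
    for tool in tools:
        function = tool["function"]
        parts.append(function["name"] + "\n")
        for argument, spec in function["parameters"]["properties"].items():
            parts.append(argument + ": " + spec["description"] + "\n")
        parts.append("\n")
    return "".join(parts)


def render_message(message):
    """Hypothetical helper mirroring the tool-call and tool-response snippets."""
    if message["role"] == "assistant" and "tool_calls" in message:
        # `tool_calls` is always a list, even when it holds a single call.
        return "".join(
            call["function"]["name"] + "\n"
            + json.dumps(call["function"]["arguments"]) + "\n"
            for call in message["tool_calls"]
        )
    if message["role"] == "tool":
        # The "name" key is available but deliberately unused here.
        return message["content"]
    return ""  # other roles are outside the scope of this sketch


tools = [{
    "type": "function",
    "function": {
        "name": "multiply",
        "description": "A function that multiplies two numbers",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {"type": "number", "description": "The first number to multiply"},
                "b": {"type": "number", "description": "The second number to multiply"},
            },
            "required": ["a", "b"],
        },
    },
}]
tool_call_message = {
    "role": "assistant",
    "tool_calls": [
        {"type": "function", "function": {"name": "multiply", "arguments": {"a": 5, "b": 6}}}
    ],
}
tool_response_message = {"role": "tool", "name": "multiply", "content": "30"}

print(render_tools(tools), end="")
print(render_message(tool_call_message), end="")
print(render_message(tool_response_message))
```

Comparing output like this with the text a Jinja template actually produces is one way to catch whitespace or ordering mismatches before they reach the model.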