Add a section on writing tool templates to the chat template docs (#33924)

* Add a section on writing tool templates to the chat template docs
* Small cleanups
2024-10-04 14:40:44 +01:00
parent 2e719e35fd
commit de4112e4d2
1 changed files with 126 additions and 1 deletions
--- a/docs/source/en/chat_templating.md
+++ b/docs/source/en/chat_templating.md
@@ -963,3 +963,128 @@ tokenizer.chat_template = open("template.jinja").read()
 As an added bonus, when you write a long, multi-line template in a separate file, line numbers in that file will
 exactly correspond to line numbers in template parsing or execution errors. This will make it much easier to
 identify the source of issues.
 ### Writing templates for tools
 Although chat templates do not enforce a specific API for tools (or for anything, really), we recommend 
 template authors try to stick to a standard API where possible. The whole point of chat templates is to allow code
 to be transferable across models, so deviating from the standard tools API means users will have to write
 custom code to use tools with your model. Sometimes it's unavoidable, but often with clever templating you can
 make the standard API work!
 Below, we'll list the elements of the standard API, and give tips on writing templates that will work well with it.
 #### Tool definitions
 Your template should expect that the variable `tools` will either be null (if no tools are passed) or a list
 of JSON schema dicts. Our chat template methods allow users to pass tools as either JSON schema or Python functions, but when
 functions are passed, we automatically generate JSON schema and pass that to your template. As a result, the
 `tools` variable that your template receives will always be a list of JSON schemas, never raw Python functions. Here is
 a sample tool JSON schema:
 ```json
 {
  "type": "function", 
  "function": {
    "name": "multiply", 
    "description": "A function that multiplies two numbers", 
    "parameters": {
      "type": "object", 
      "properties": {
        "a": {
          "type": "number", 
          "description": "The first number to multiply"
        }, 
        "b": {
          "type": "number",
          "description": "The second number to multiply"
        }
      }, 
      "required": ["a", "b"]
    }
  }
 }
 ```
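As a rough illustration of the conversion described above, the sketch below builds a schema of the same shape from a Python function. It is not the library's implementation: `sketch_tool_schema`, the `JSON_TYPES` mapping and the `arg_descriptions` argument are hypothetical names chosen for this example, and only a few basic types are handled.

```python
import inspect
import json
from typing import get_type_hints

# Assumed mapping from Python annotations to JSON schema type names.
JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def sketch_tool_schema(func, arg_descriptions):
    """Build a tool schema dict shaped like the sample above (illustrative only)."""
    hints = get_type_hints(func)
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {
            "type": JSON_TYPES[hints[name]],
            "description": arg_descriptions[name],
        }
        # Arguments without a default value are treated as required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": inspect.getdoc(func),
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


def multiply(a: float, b: float):
    """A function that multiplies two numbers"""
    return a * b


schema = sketch_tool_schema(
    multiply,
    {"a": "The first number to multiply", "b": "The second number to multiply"},
)
print(json.dumps(schema, indent=2))
```

Whatever the caller passes in, a template only ever sees dicts like `schema`, so it can index into `tool['function']` without checking for other input types.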
 And here is some example code for handling tools in your chat template. Remember, this is just an example for a
 specific format - your model will probably need different formatting!
 ```text
 {%- if tools %}
    {%- for tool in tools %}
        {{- '<tool>' + tool['function']['name'] + '\n' }}
        {%- for argument in tool['function']['parameters']['properties'] %}
            {{- argument + ': ' + tool['function']['parameters']['properties'][argument]['description'] + '\n' }}
        {%- endfor %}
        {{- '\n</tool>' }}
    {%- endfor %}
 {%- endif %}
 ```
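One way to check a tool-definition template of this shape is to render it directly with the `jinja2` package and inspect the string it produces. The sketch below does that for the sample `multiply` schema. The `trim_blocks` and `lstrip_blocks` settings are an assumption about how the template will be compiled, and `TOOL_TEMPLATE` is just a local name for this example.

```python
from jinja2 import Environment

# A tool-definition template in the example format, with both loops closed by endfor.
TOOL_TEMPLATE = r"""
{%- if tools %}
    {%- for tool in tools %}
        {{- '<tool>' + tool['function']['name'] + '\n' }}
        {%- for argument in tool['function']['parameters']['properties'] %}
            {{- argument + ': ' + tool['function']['parameters']['properties'][argument]['description'] + '\n' }}
        {%- endfor %}
        {{- '\n</tool>' }}
    {%- endfor %}
{%- endif %}
"""

tools = [
    {
        "type": "function",
        "function": {
            "name": "multiply",
            "description": "A function that multiplies two numbers",
            "parameters": {
                "type": "object",
                "properties": {
                    "a": {"type": "number", "description": "The first number to multiply"},
                    "b": {"type": "number", "description": "The second number to multiply"},
                },
                "required": ["a", "b"],
            },
        },
    }
]

# Assumed environment settings; adjust them to match your own setup.
env = Environment(trim_blocks=True, lstrip_blocks=True)
template = env.from_string(TOOL_TEMPLATE)

rendered = template.render(tools=tools)
empty = template.render(tools=None)  # no tools passed: nothing is rendered

print(repr(rendered))
print(repr(empty))
```

Printing with `repr` makes every newline visible, which helps when comparing the output against the exact format a model was trained on.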
 The specific tokens and tool descriptions your template renders should of course be chosen to match the ones your model
 was trained with. There is no requirement that your **model** understands JSON schema input, only that your template can translate
 JSON schema into your model's format. For example, [Command-R](https://huggingface.co/CohereForAI/c4ai-command-r-plus-08-2024) 
 was trained with tools defined using Python function headers, but the Command-R tool template accepts JSON schema, 
 converts types internally and renders the input tools as Python headers. You can do a lot with templates!
 #### Tool calls
 Tool calls, if present, will be a list attached to a message with the "assistant" role. Note that `tool_calls` is
 always a list, even though most tool-calling models only support a single tool call at a time, which means
 the list will usually contain just one element. Here is a sample message dict containing a tool call:
 ```json
 {
  "role": "assistant",
  "tool_calls": [
    {
      "type": "function",
      "function": {
        "name": "multiply",
        "arguments": {
          "a": 5,
          "b": 6
        }
      }
    }
  ]
 }
 ```
 And a common pattern for handling them would be something like this:
 ```text
 {%- if message['role'] == 'assistant' and 'tool_calls' in message %}
    {%- for tool_call in message['tool_calls'] %}
        {{- '<tool_call>' + tool_call['function']['name'] + '\n' + tool_call['function']['arguments']|tojson + '\n</tool_call>' }}
    {%- endfor %}
 {%- endif %}
 ```
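To see what a tool-call pattern like this produces, it can be rendered standalone with the `jinja2` package. This is a minimal sketch under a few assumptions: the snippet is wrapped in a `for message in messages` loop so it can run on its own, the environment settings are guesses at a typical setup, and Jinja's built-in `tojson` filter is replaced with plain `json.dumps` because the built-in one targets HTML output and escapes characters such as `<` and `>`.

```python
import json

from jinja2 import Environment

# The tool-call pattern, wrapped in a message loop so it can be rendered alone.
CALL_TEMPLATE = r"""
{%- for message in messages %}
    {%- if message['role'] == 'assistant' and 'tool_calls' in message %}
        {%- for tool_call in message['tool_calls'] %}
            {{- '<tool_call>' + tool_call['function']['name'] + '\n' + tool_call['function']['arguments']|tojson + '\n</tool_call>' }}
        {%- endfor %}
    {%- endif %}
{%- endfor %}
"""

messages = [
    {
        "role": "assistant",
        "tool_calls": [
            {
                "type": "function",
                "function": {"name": "multiply", "arguments": {"a": 5, "b": 6}},
            }
        ],
    }
]

# Assumed environment settings; adjust them to match your own setup.
env = Environment(trim_blocks=True, lstrip_blocks=True)
# Override the HTML-oriented built-in filter with plain JSON serialization.
env.filters["tojson"] = json.dumps

rendered = env.from_string(CALL_TEMPLATE).render(messages=messages)
print(repr(rendered))
```

Because `tool_calls` is a list, the inner loop handles one call or several without any change to the template.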
 Again, you should render the tool call with the formatting and special tokens that your model expects.
 #### Tool responses
 Tool responses have a simple format: each one is a message dict with the "tool" role, a "name" key giving the name
 of the called function, and a "content" key containing the result of the tool call. Here is a sample tool response:
 ```json
 {
  "role": "tool",
  "name": "multiply",
  "content": "30"
 }
 ```
 You don't need to use all of the keys in the tool response. For example, if your model doesn't expect the function
 name to be included in the tool response, then rendering it can be as simple as:
 ```text
 {%- if message['role'] == 'tool' %}
    {{- "<tool_result>" + message['content'] + "</tool_result>" }}
 {%- endif %}
 ```
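The same kind of standalone check works for tool responses. The sketch below wraps the snippet in a `for message in messages` loop and renders it with the `jinja2` package. The loop, the environment settings and the `RESPONSE_TEMPLATE` name are assumptions made for this example.

```python
from jinja2 import Environment

# The tool-response pattern, wrapped in a message loop so it can be rendered alone.
RESPONSE_TEMPLATE = r"""
{%- for message in messages %}
    {%- if message['role'] == 'tool' %}
        {{- "<tool_result>" + message['content'] + "</tool_result>" }}
    {%- endif %}
{%- endfor %}
"""

messages = [
    # The "name" key is present but this template chooses not to render it.
    {"role": "tool", "name": "multiply", "content": "30"},
]

# Assumed environment settings; adjust them to match your own setup.
env = Environment(trim_blocks=True, lstrip_blocks=True)
rendered = env.from_string(RESPONSE_TEMPLATE).render(messages=messages)
print(repr(rendered))
```

Note that the `+` concatenation only works when `content` is a string, as it is in the sample response. A non-string value such as the integer `30` would raise an error during rendering.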
 Again, remember that the actual formatting and special tokens are model-specific - you should take a lot of care
 to ensure that tokens, whitespace and everything else exactly match the format your model was trained with!