Documentation about loading a fast tokenizer within Transformers (#11029)

* Documentation about loading a fast tokenizer within Transformers

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
This commit is contained in:
Lysandre Debut
2021-04-05 10:51:16 -04:00
committed by GitHub
parent 6c25f5228e
commit 9f4e0c23d6
5 changed files with 111 additions and 7 deletions

View File

@@ -62,6 +62,11 @@ PreTrainedTokenizer
PreTrainedTokenizerFast
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The :class:`~transformers.PreTrainedTokenizerFast` depend on the `tokenizers
<https://huggingface.co/docs/tokenizers>`__ library. The tokenizers obtained from the 🤗 tokenizers library can be
loaded very simply into 🤗 transformers. Take a look at the :doc:`Using tokenizers from 🤗 tokenizers
<../fast_tokenizers>` page to understand how this is done.
.. autoclass:: transformers.PreTrainedTokenizerFast
:special-members: __call__
:members: batch_decode, convert_ids_to_tokens, convert_tokens_to_ids, convert_tokens_to_string, decode, encode,