Tokenizer kwargs in textgeneration pipe (#28362)
* added args to the pipeline * added test * more sensical tests * fixup * docs * typo ; * docs * made changes to support named args * fixed test * docs update * styles * docs * docs
This commit is contained in:
@@ -216,6 +216,12 @@ array([[1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
|
||||
</tf>
|
||||
</frameworkcontent>
|
||||
|
||||
<Tip>
|
||||
Different pipelines support tokenizer arguments in their `__call__()` differently. `text-2-text-generation` pipelines support (i.e. pass on)
|
||||
only `truncation`. `text-generation` pipelines support `max_length`, `truncation`, `padding` and `add_special_tokens`.
|
||||
In `fill-mask` pipelines, tokenizer arguments can be passed in the `tokenizer_kwargs` argument (dictionary).
|
||||
</Tip>
|
||||
|
||||
## Audio
|
||||
|
||||
For audio tasks, you'll need a [feature extractor](main_classes/feature_extractor) to prepare your dataset for the model. The feature extractor is designed to extract features from raw audio data, and convert them into tensors.
|
||||
|
||||
Reference in New Issue
Block a user