Replace as_target context managers by direct calls (#18325)

* Preliminary work on tokenizers

* Quality + fix tests

* Treat processors

* Fix pad

* Remove all uses of  in tests, docs and examples

* Replace all as_target_tokenizer

* Fix tests

* Fix quality

* Update examples/flax/image-captioning/run_image_captioning_flax.py

Co-authored-by: amyeroberts <amy@huggingface.co>

* Style

Co-authored-by: amyeroberts <amy@huggingface.co>
This commit is contained in:
Sylvain Gugger
2022-07-29 08:09:09 -04:00
committed by GitHub
parent a64bcb564d
commit 986526a0e4
80 changed files with 725 additions and 550 deletions

View File

@@ -486,10 +486,8 @@ A processor combines a feature extractor and tokenizer. Load a processor with [`
>>> def prepare_dataset(example):
... audio = example["audio"]
... example["input_values"] = processor(audio["array"], sampling_rate=16000)
... example.update(processor(audio=audio["array"], text=example["text"], sampling_rate=16000))
... with processor.as_target_processor():
... example["labels"] = processor(example["text"]).input_ids
... return example
```