Fix some doctests after PR 15775 (#20036)

* Add skip_special_tokens=True in some doctest

* For T5

* Fix for speech_to_text.mdx

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
This commit is contained in:
Yih-Dar
2022-11-03 14:18:45 +01:00
committed by GitHub
parent a639ea9e8a
commit 9ccea7acb1
4 changed files with 6 additions and 6 deletions

View File

@@ -57,7 +57,7 @@ be installed as follows: `apt install libsndfile1-dev`
>>> inputs = processor(ds[0]["audio"]["array"], sampling_rate=ds[0]["audio"]["sampling_rate"], return_tensors="pt")
>>> generated_ids = model.generate(inputs["input_features"], attention_mask=inputs["attention_mask"])
>>> transcription = processor.batch_decode(generated_ids)
>>> transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)
>>> transcription
['mister quilter is the apostle of the middle classes and we are glad to welcome his gospel']
```
@@ -87,9 +87,9 @@ be installed as follows: `apt install libsndfile1-dev`
... forced_bos_token_id=processor.tokenizer.lang_code_to_id["fr"],
... )
>>> translation = processor.batch_decode(generated_ids)
>>> translation = processor.batch_decode(generated_ids, skip_special_tokens=True)
>>> translation
["<lang:fr> (Vidéo) Si M. Kilder est l'apossible des classes moyennes, et nous sommes heureux d'être accueillis dans son évangile."]
["(Vidéo) Si M. Kilder est l'apossible des classes moyennes, et nous sommes heureux d'être accueillis dans son évangile."]
```
See the [model hub](https://huggingface.co/models?filter=speech_to_text) to look for Speech2Text checkpoints.

View File

@@ -285,7 +285,7 @@ The predicted tokens will then be placed between the sentinel tokens.
>>> sequence_ids = model.generate(input_ids)
>>> sequences = tokenizer.batch_decode(sequence_ids)
>>> sequences
['<pad> <extra_id_0> park offers<extra_id_1> the<extra_id_2> park.</s>']
['<pad><extra_id_0> park offers<extra_id_1> the<extra_id_2> park.</s>']
```