Refactor Whisper ASR pipeline to include language too. (#21427)

* [WIP] Whisper refactor to support language output.

* Handling merges.

* A bit more cleanup and comments.

* Many improvements.

Lots of details everywhere.

* Cleanup old code and tests.

* Handle lone timestamp tokens (just recover when something bad happens).

* Adding return_language example.

* No ffmpeg.

* Hmm.

* Some corrections.

* Both fast and slow.

* New black.

* Update src/transformers/models/whisper/tokenization_whisper.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/whisper/tokenization_whisper.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Remove print.

* Undoing tests modifications.

* Smaller test modifications.

* Rename.

* Remove maxDiff.

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Nicolas Patry
2023-03-02 18:12:19 +01:00
committed by GitHub
parent 8e5a1b2abb
commit 1325459105
5 changed files with 518 additions and 128 deletions

@@ -538,7 +538,7 @@ class AutomaticSpeechRecognitionPipelineTests(unittest.TestCase):
"tight-loan cloth that was the only garment he wore, the "
"cut"
),
-"timestamp": (5.5, 11.94),
+"timestamp": (5.5, 11.95),
},
{
"text": (
@@ -546,15 +546,15 @@ class AutomaticSpeechRecognitionPipelineTests(unittest.TestCase):
"overstrained eyes, even the soaring arena around him "
"with"
),
-"timestamp": (11.94, 19.6),
+"timestamp": (11.95, 19.61),
},
{
"text": " the thousands of spectators, retrievality is not worth thinking about.",
-"timestamp": (19.6, 26.66),
+"timestamp": (19.61, 25.0),
},
{
"text": " His instant panic was followed by a small, sharp blow high on his chest.",
-"timestamp": (26.66, 31.06),
+"timestamp": (25.0, 29.4),
},
],
"text": (