Refactor whisper asr pipeline to include language too. (#21427)
* [WIP] whisper refacto to support language output.
* Handling merges.
* A bit more cleanup and comments.
* Many improvements. Lots of details everywhere.
* Cleanup old code and tests.
* Handle lone timestamp tokens (just recover when something bad happens).
* Adding return_language example.
* No ffmpeg.
* Hmm.
* Some corrections.
* Both fast and slow.
* New black.
* Update src/transformers/models/whisper/tokenization_whisper.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/whisper/tokenization_whisper.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Remove print.
* Undoing tests modifications.
* Smaller test modifications.
* Rename.
* Remove maxDiff.

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
@@ -538,7 +538,7 @@ class AutomaticSpeechRecognitionPipelineTests(unittest.TestCase):
 "tight-loan cloth that was the only garment he wore, the "
 "cut"
 ),
-"timestamp": (5.5, 11.94),
+"timestamp": (5.5, 11.95),
 },
 {
 "text": (
@@ -546,15 +546,15 @@ class AutomaticSpeechRecognitionPipelineTests(unittest.TestCase):
 "overstrained eyes, even the soaring arena around him "
 "with"
 ),
-"timestamp": (11.94, 19.6),
+"timestamp": (11.95, 19.61),
 },
 {
 "text": " the thousands of spectators, retrievality is not worth thinking about.",
-"timestamp": (19.6, 26.66),
+"timestamp": (19.61, 25.0),
 },
 {
 "text": " His instant panic was followed by a small, sharp blow high on his chest.",
-"timestamp": (26.66, 31.06),
+"timestamp": (25.0, 29.4),
 },
 ],
 "text": (
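In the expected chunks of the diff above, each chunk starts where the previous one ends (11.95, 19.61, 25.0). The sketch below checks that property on the updated timestamp values. It is only an illustration: the helper name `chunks_are_contiguous` and the tolerance are assumptions, not part of transformers or of this commit, and the chunk texts are left out because the diff shows them only partially.

```python
from typing import Dict, List, Tuple

# A chunk here carries only the "timestamp" key seen in the diff.
Chunk = Dict[str, Tuple[float, float]]


def chunks_are_contiguous(chunks: List[Chunk], tol: float = 1e-6) -> bool:
    """Hypothetical helper: True if every chunk starts where the previous one ends."""
    for prev, curr in zip(chunks, chunks[1:]):
        prev_end = prev["timestamp"][1]
        curr_start = curr["timestamp"][0]
        if abs(curr_start - prev_end) > tol:
            return False
    return True


# Updated timestamp values copied from the diff.
new_chunks: List[Chunk] = [
    {"timestamp": (5.5, 11.95)},
    {"timestamp": (11.95, 19.61)},
    {"timestamp": (19.61, 25.0)},
    {"timestamp": (25.0, 29.4)},
]

print(chunks_are_contiguous(new_chunks))
```

A check like this could sit next to the exact-value assertions in a test, since it keeps holding when boundaries shift by a hundredth of a second, as they do in this change.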