HuggingFace_transformer/tests/models
Matthijs Hollemans cd927a4736 add word-level timestamps to Whisper (#23205)
* let's go!

* initial implementation of token-level timestamps

* only return a single timestamp per token

* remove token probabilities

* fix return type

* fix doc comment

* strip special tokens

* rename

* revert to not stripping special tokens

* only support models that have alignment_heads

* add integration test

* consistently name it token-level timestamps

* small DTW tweak

* initial support for ASR pipeline

* fix pipeline doc comments

* resolve token timestamps in pipeline with chunking

* change warning when no final timestamp is found

* return word-level timestamps

* fixup

* fix bug that skipped final word in each chunk

* fix failing unit tests

* merge punctuations into the words

* also return word tokens

* also return token indices

* add (failing) unit test for combine_tokens_into_words

* make combine_tokens_into_words private

* restore OpenAI's punctuation rules

* add pipeline tests

* make requested changes

* PR review changes

* fix failing pipeline test

* small stuff from PR

* only return words and their timestamps, not segments

* move alignment_heads into generation config

* forgot to set alignment_heads in pipeline tests

* tiny comment fix

* grr
2023-06-21 17:48:21 +02:00