[Whisper] Add large-v3 version support (#27336)
* Enable large-v3 downloading and update language list * Fix type annotation * make fixup * Export Whisper feature extractor * Fix error after extractor loading * Do not use pre-computed mel filters * Save the full preprocessor properly * Update docs * Remove comment Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Add alignment heads consistent with each Whisper version * Remove alignment heads calculation * Save fast tokenizer format as well * Fix slow to fast conversion * Fix bos/eos/pad token IDs in the model config * Add decoder_start_token_id to config --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
This commit is contained in:
committed by
GitHub
parent
93f2de858b
commit
87e217d065
@@ -34,13 +34,13 @@ The original code can be found [here](https://github.com/openai/whisper).
|
||||
- Inference is currently only implemented for short-form i.e. audio is pre-segmented into <=30s segments. Long-form (including timestamps) will be implemented in a future release.
|
||||
- One can use [`WhisperProcessor`] to prepare audio for the model, and decode the predicted ID's back into text.
|
||||
|
||||
- To convert the tokenizer, we recommend using the following:
|
||||
- To convert the model and the processor, we recommend using the following:
|
||||
|
||||
```bash
|
||||
python src/transformers/models/whisper/convert_openai_to_hf.py --checkpoint_path "" --pytorch_dump_folder_path "Arthur/whisper-3" --convert_tokenizer True --whisper_version 3 --multilingual True
|
||||
python src/transformers/models/whisper/convert_openai_to_hf.py --checkpoint_path "" --pytorch_dump_folder_path "Arthur/whisper-3" --convert_preprocessor True
|
||||
```
|
||||
Here the `whisper_version` will set the number of languages to `100` to account for `cantonese` which was added in `whisper-large-v3`.
|
||||
|
||||
The script will automatically determine all necessary parameters from the OpenAI checkpoint. A `tiktoken` library needs to be installed
|
||||
to perform the conversion of the OpenAI tokenizer to the `tokenizers` version.
|
||||
|
||||
## Inference
|
||||
|
||||
|
||||
Reference in New Issue
Block a user