[Whisper] Add large-v3 version support (#27336)

* Enable large-v3 downloading and update language list * Fix type annotation * make fixup * Export Whisper feature extractor * Fix error after extractor loading * Do not use pre-computed mel filters * Save the full preprocessor properly * Update docs * Remove comment Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Add alignment heads consistent with each Whisper version * Remove alignment heads calculation * Save fast tokenizer format as well * Fix slow to fast conversion * Fix bos/eos/pad token IDs in the model config * Add decoder_start_token_id to config --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-11-21 00:36:48 +08:00
parent 93f2de858b
commit 87e217d065
3 changed files with 100 additions and 29 deletions
--- a/docs/source/ko/model_doc/whisper.md
+++ b/docs/source/ko/model_doc/whisper.md
@@ -33,6 +33,14 @@ Whisper 모델은 Alec Radford, Jong Wook Kim, Tao Xu, Greg Brockman, Christine
 - 현재 추론은 짧은 형식에만 구현되어 있으며, 오디오는 30초 미만의 세그먼트로 미리 분할되어야 합니다. 타임스탬프를 포함한 긴 형식에 대한 추론은 향후 릴리스에서 구현될 예정입니다.
 - [`WhisperProcessor`]를 사용하여 모델에 사용할 오디오를 준비하고, 예측된 ID를 텍스트로 디코딩할 수 있습니다.

+- 모델과 프로세서를 변환하려면 다음을 사용하는 것이 좋습니다:
+
+```bash
+python src/transformers/models/whisper/convert_openai_to_hf.py --checkpoint_path "" --pytorch_dump_folder_path "Arthur/whisper-3" --convert_preprocessor True
+```
+스크립트는 OpenAI 체크포인트에서 필요한 모든 매개변수를 자동으로 결정합니다. OpenAI 변환을 수행하려면 `tiktoken` 라이브러리를 설치해야 합니다.
+라이브러리를 설치해야 OpenAI 토큰화기를 `tokenizers` 버전으로 변환할 수 있습니다.
+
 이 모델은 [Arthur Zucker](https://huggingface.co/ArthurZ)에 의해 제공되었습니다. 이 모델의 Tensorflow 버전은 [amyeroberts](https://huggingface.co/amyeroberts)에 의해 제공되었습니다.
 원본 코드는 [여기](https://github.com/openai/whisper)에서 찾을 수 있습니다.