Add Text-To-Speech pipeline (#24952)

* add AutoModelForTextToSpeech class

* add TTS pipeline and testing

* add docstrings to text_to_speech pipeline

* fix torch dependency

* correct 'processor is None' case in Pipeline

* correct repo id

* modify text-to-speech -> text-to-audio

* remove processor

* rename text_to_speech pipeline files to text_to_audio

* add textToWaveform and textToSpectrogram instead of textToAudio classes

* update TTS pipeline to the bare minimum

* update tests TTS pipeline

* make style and erase useless import torch in TTS pipeline tests

* modify how to check if generate or forward in TTS pipeline

* remove unnecessary extra new lines

* Apply suggestions from code review

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* refactor input_texts -> text_inputs

* correct docstrings of TTS.__call__

* correct the shape of generated waveform

* take care of Bark tokenizer special case

* correct run_pipeline_test TTS

* make style

* update TTS docstrings

* address Sylvain nit refactors

* make style

* refactor into one liners

* correct squeeze

* correct way to test if forward or generate

* Update output audio waveform shape

* make style

* correct import

* modify how the TTS pipeline tests whether a model can generate

* align shape output of TTS pipeline with consistent shape

---------

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

Yoach Lacombe

2023-08-17 18:34:47 +02:00

committed by

GitHub

parent c4c0ceff09

commit b8f69d0d10

11 changed files with 425 additions and 0 deletions

									
docs/source/en/main_classes/pipelines.md (7 lines changed)
@@ -318,6 +318,13 @@ Pipelines available for audio tasks include the following.
    - __call__
    - all

### TextToAudioPipeline

[[autodoc]] TextToAudioPipeline
    - __call__
    - all

### ZeroShotAudioClassificationPipeline

[[autodoc]] ZeroShotAudioClassificationPipeline
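
Several items in the commit message concern how the TTS pipeline decides between calling `generate` and running a plain forward pass. The following is a minimal, self-contained sketch of that kind of dispatch, not the pipeline's actual code: `GenerativeModel`, `ForwardOnlyModel`, and `synthesize` are hypothetical stand-ins introduced here for illustration, and the sketch assumes each model exposes a `can_generate()` method.

```python
# Illustrative sketch only: these classes are hypothetical stand-ins,
# not classes from the transformers library.


class GenerativeModel:
    """Stand-in for a model that produces audio through generate()."""

    def can_generate(self):
        return True

    def generate(self, **model_inputs):
        # Pretend waveform with shape (batch, samples).
        return [[0.0, 0.1, 0.2]]


class ForwardOnlyModel:
    """Stand-in for a model that produces audio in a single forward pass."""

    def can_generate(self):
        return False

    def __call__(self, **model_inputs):
        # Forward passes are assumed to return a tuple whose first
        # element is the waveform.
        return ([[0.5, 0.6]],)


def synthesize(model, **model_inputs):
    """Call generate() when the model supports it, else run a forward pass."""
    if model.can_generate():
        return model.generate(**model_inputs)
    return model(**model_inputs)[0]


generated = synthesize(GenerativeModel(), input_ids=[1, 2, 3])
forwarded = synthesize(ForwardOnlyModel(), input_ids=[1, 2, 3])
```

Asking the model whether it can generate, rather than keeping a list of model types in the pipeline, means a new model only has to answer that one question to be routed correctly.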
