Add MusicGen Melody (#28819)

* first modeling code

* make repository

* still WIP

* update model

* add tests

* add latest change

* clean docstrings and copied from

* update docstrings md and readme

* correct chroma function

* correct copied from and remove unreleated test

* add doc to toctree

* correct imports

* add convert script to notdoctested

* Add suggestion from Sanchit

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* correct get_uncoditional_inputs docstrings

* modify README according to SANCHIT feedback

* add chroma to audio utils

* clean librosa and torchaudio hard dependencies

* fix FE

* refactor audio decoder -> audio encoder for consistency with previous musicgen

* refactor conditional -> encoder

* modify sampling rate logics

* modify license at the beginning

* refactor all_self_attns->all_attentions

* remove ignore copy from causallm generate

* add copied from for from_sub_models

* fix make copies

* add warning if audio is truncated

* add copied from where relevant

* remove artefact

* fix convert script

* fix torchaudio and FE

* modify chroma method according to feedback-> better naming

* refactor input_values->input_features

* refactor input_values->input_features and fix import fe

* add input_features to docstrigs

* correct inputs_embeds logics

* remove dtype conversion

* refactor _prepare_conditional_hidden_states_kwargs_for_generation ->_prepare_encoder_hidden_states_kwargs_for_generation

* change warning for chroma length

* Update src/transformers/models/musicgen_melody/convert_musicgen_melody_transformers.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* change way to save wav, using soundfile

* correct docs and change to soundfile

* fix import

* fix init proj layers

* remove line breaks from md

* fix issue with docstrings

* add FE suggestions

* improve is in logics and remove useless imports

* remove custom from_pretrained

* simplify docstring code

* add suggestions for modeling tests

* make style

* update converting script with sanity check

* remove encoder attention mask from conditional generation

* replace musicgen melody checkpoints with official orga

* rename ylacombe->facebook in checkpoints

* fix copies

* remove unecessary warning

* add shape in code docstrings

* add files to slow doc tests

* fix md bug and add md to not_tested

* make fix-copies

* fix hidden states test and batching

---------

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
This commit is contained in:
Yoach Lacombe
2024-03-18 13:06:12 +00:00
committed by GitHub
parent bf3dfd1160
commit c43b380e70
40 changed files with 5947 additions and 3 deletions

View File

@@ -395,6 +395,7 @@ OBJECTS_TO_IGNORE = [
"MraConfig",
"MusicgenDecoderConfig",
"MusicgenForConditionalGeneration",
"MusicgenMelodyForConditionalGeneration",
"MvpConfig",
"MvpTokenizerFast",
"MT5Tokenizer",

View File

@@ -294,6 +294,7 @@ IGNORE_NON_AUTO_CONFIGURED = PRIVATE_MODELS.copy() + [
"BarkCoarseModel",
"BarkFineModel",
"BarkSemanticModel",
"MusicgenMelodyModel",
"MusicgenModel",
"MusicgenForConditionalGeneration",
"SpeechT5ForSpeechToSpeech",

View File

@@ -176,6 +176,7 @@ docs/source/en/model_doc/mpt.md
docs/source/en/model_doc/mra.md
docs/source/en/model_doc/mt5.md
docs/source/en/model_doc/musicgen.md
docs/source/en/model_doc/musicgen_melody.md
docs/source/en/model_doc/mvp.md
docs/source/en/model_doc/nat.md
docs/source/en/model_doc/nezha.md
@@ -706,6 +707,7 @@ src/transformers/models/mt5/modeling_flax_mt5.py
src/transformers/models/mt5/modeling_mt5.py
src/transformers/models/mt5/modeling_tf_mt5.py
src/transformers/models/musicgen/convert_musicgen_transformers.py
src/transformers/models/musicgen_melody/convert_musicgen_melody_transformers.py
src/transformers/models/mvp/modeling_mvp.py
src/transformers/models/nezha/modeling_nezha.py
src/transformers/models/nllb_moe/configuration_nllb_moe.py

View File

@@ -8,4 +8,6 @@ docs/source/en/tasks/prompting.md
src/transformers/models/blip_2/modeling_blip_2.py
src/transformers/models/ctrl/modeling_ctrl.py
src/transformers/models/fuyu/modeling_fuyu.py
src/transformers/models/kosmos2/modeling_kosmos2.py
src/transformers/models/kosmos2/modeling_kosmos2.py
src/transformers/models/musicgen_melody/modeling_musicgen_melody.py
src/transformers/models/musicgen_melody/processing_musicgen_melody.py