🔴 Fix EnCodec internals and integration tests (#39431)

* EnCodec fixes and update integration tests.

* Apply padding mask when normalize is False.

* Update comment of copied function.

* Fix padding mask within modeling.

* Revert padding function.

* Simplify handling of padding_mask.

* Address variable codebook size.

* Add output for padding for consistency with original model, fix docstrings.

* last_frame_pad_length as int

* Update example code.

* Improve docstring/comments.

* Shorten expected output.

* Consistent docstring.

* Parameterize tests.

* Properties for derived variables.

* Update expected outputs from GitHub runner.

* Consistent outputs with runner GPUs.
This commit is contained in:
Eric Bezzam
2025-07-23 19:39:27 +02:00
committed by GitHub
parent 7a4e2e7868
commit c5a80dd6c4
4 changed files with 1184 additions and 233 deletions

View File

@@ -47,7 +47,8 @@ Here is a quick example of how to encode and decode an audio using this model:
>>> inputs = processor(raw_audio=audio_sample, sampling_rate=processor.sampling_rate, return_tensors="pt")
>>> encoder_outputs = model.encode(inputs["input_values"], inputs["padding_mask"])
>>> audio_values = model.decode(encoder_outputs.audio_codes, encoder_outputs.audio_scales, inputs["padding_mask"])[0]
>>> # `encoder_outputs.audio_codes` contains discrete codes
>>> audio_values = model.decode(**encoder_outputs, padding_mask=inputs["padding_mask"])[0]
>>> # or the equivalent with a forward pass
>>> audio_values = model(inputs["input_values"], inputs["padding_mask"]).audio_values
```