Support BatchNorm in Hubert pos_conv_emb as in fairseq (#34389)
* Support BatchNorm in Hubert pos_conv_emb as in fairseq
* Correct the new defaults (#34377)
* Correct the new defaults
* CIs
* add check
* Update utils.py
* Update utils.py
* Add the max_length in generate test checking shape without passing length
* style
* CIs
* fix fx CI issue
* [auto. ping] Avoid sending empty info + add more team members (#34383)
* update
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Fix glm (#34388)
* Fix duplicated
* fix import
* Use non nested images and batched text Idefics2/3 (#34222)
* add support for non nested images and add tests
* add tests error scenario
* fix style
* added single and no image to error tests
* Fix onnx non-expotable inplace aten op (#34376)
* fix onnx non-expotable inplace op
* mistral, qwen2, qwen2_vl, starcoder2
* fixup copies
* Fix right padding in LLaVA models (#34305)
* fix right pad llavas
* device mismatch
* no filter (#34391)
* no filter
* no filter
* no filter
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* SynthID: better example (#34372)
* better example
* Update src/transformers/generation/configuration_utils.py
* Update src/transformers/generation/logits_process.py
* nits
* Tests: upgrade `test_eager_matches_sdpa_generate` (#34386)
* Fix bnb training test failure (#34414)
* Fix bnb training test: compatibility with OPTSdpaAttention
* Avoid check expected exception when it is on CUDA (#34408)
* update
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Fix typos in agents_advanced.md (#34405)
* [docs] Cache implementations (#34325)
  cache
* [run-slow] hubert
* Support BatchNorm in Hubert pos_conv_emb as in fairseq
  Add conversion integration test, and make batchnorm explicit variable
* Support BatchNorm in Hubert pos_conv_emb as in fairseq
  fix make fixup styling changes
* [run-slow] hubert
* Support BatchNorm in Hubert pos_conv_emb as in fairseq
* [run-slow] hubert
* Support BatchNorm in Hubert pos_conv_emb as in fairseq
  Add conversion integration test, and make batchnorm explicit variable
* Support BatchNorm in Hubert pos_conv_emb as in fairseq
  fix make fixup styling changes
* [run-slow] hubert
* [run-slow] hubert
---------
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
Co-authored-by: Rudy Delouya <rudy.delouya@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
@@ -943,3 +943,40 @@ class HubertModelIntegrationTest(unittest.TestCase):
         self.assertTrue(torch.allclose(outputs[:, :4, :4], expected_outputs_first, atol=5e-3))
         self.assertTrue(torch.allclose(outputs[:, -4:, -4:], expected_outputs_last, atol=5e-3))
         self.assertTrue(abs(outputs.sum() - expected_output_sum) < 0.1)
+
+    def test_inference_hubert_25hz(self):
+        model = HubertModel.from_pretrained("slprl/mhubert-base-25hz").to(torch_device)
+
+        sample = self._load_datasamples(1)
+        input_speech = torch.tensor(sample[0], dtype=torch.float, device=torch_device).unsqueeze(0)
+
+        with torch.no_grad():
+            outputs = model(input_speech, output_hidden_states=True).hidden_states[11]
+
+        # expected outputs taken from the original textlesslib implementation by:
+        # model = SpeechEncoder.by_name(dense_model_name='mhubert-base-25hz', quantizer_model_name='kmeans',
+        #                               vocab_size=500, deduplicate=False, need_f0=False)
+        # model(wav)['dense']
+        expected_outputs_first = torch.tensor(
+            [
+                [0.0267, 0.1776, -0.1706, -0.4559],
+                [-0.2430, -0.2943, -0.1864, -0.1187],
+                [-0.1812, -0.4239, -0.1916, -0.0858],
+                [-0.1495, -0.4758, -0.4036, 0.0302],
+            ],
+            device=torch_device,
+        )
+        expected_outputs_last = torch.tensor(
+            [
+                [0.3366, -0.2734, -0.1415, -0.3055],
+                [0.2329, -0.3580, -0.1421, -0.3197],
+                [0.1631, -0.4301, -0.1965, -0.2956],
+                [0.3342, -0.2185, -0.2253, -0.2363],
+            ],
+            device=torch_device,
+        )
+        expected_output_sum = 1681.7603
+
+        self.assertTrue(torch.allclose(outputs[:, :4, :4], expected_outputs_first, atol=5e-3))
+        self.assertTrue(torch.allclose(outputs[:, -4:, -4:], expected_outputs_last, atol=5e-3))
+        self.assertTrue(abs(outputs.sum() - expected_output_sum) < 0.1)
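The new test accepts the model output when two 4x4 corner slices match the textlesslib reference values within `atol=5e-3`. The following is a minimal sketch of what that tolerance admits, not code from the commit. It assumes the element-wise rule documented for `torch.allclose`, |a - b| <= atol + rtol * |b| with the default `rtol=1e-5`, and reimplements it in plain Python so it runs without torch. The helper `allclose_2d` and the shifted `within` / `outside` values are hypothetical and exist only for illustration; the reference matrix is copied from `expected_outputs_first` above.

```python
def allclose_2d(actual, expected, atol=5e-3, rtol=1e-5):
    """Pure-Python stand-in (hypothetical helper) for the element-wise
    rule |a - e| <= atol + rtol * |e| applied to two nested lists."""
    if len(actual) != len(expected):
        return False
    for row_a, row_e in zip(actual, expected):
        if len(row_a) != len(row_e):
            return False
        for a, e in zip(row_a, row_e):
            if abs(a - e) > atol + rtol * abs(e):
                return False
    return True


# Reference corner values copied from expected_outputs_first in the test.
expected_first = [
    [0.0267, 0.1776, -0.1706, -0.4559],
    [-0.2430, -0.2943, -0.1864, -0.1187],
    [-0.1812, -0.4239, -0.1916, -0.0858],
    [-0.1495, -0.4758, -0.4036, 0.0302],
]

# Made-up "model outputs": the reference shifted by a constant offset.
within = [[v + 0.004 for v in row] for row in expected_first]   # inside atol
outside = [[v + 0.006 for v in row] for row in expected_first]  # outside atol

print(allclose_2d(within, expected_first))   # deviation of 0.004 is accepted
print(allclose_2d(outside, expected_first))  # deviation of 0.006 is rejected
```

Under these assumptions, a per-element deviation of about 0.005 is the largest the slice checks tolerate, while the separate sum check bounds the accumulated drift over the whole hidden-state tensor to 0.1.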