Quentin Lhoest
1ecd52e50a
Add torchcodec in docstrings/tests for datasets 4.0 (#39156)
* fix dataset run_object_detection
* bump version
* keep same dataset actually
* torchcodec in docstrings and testing utils
* torchcodec in dockerfiles and requirements
* remove duplicate
* add torchcodec to all the remaining docker files
* fix tests
* support torchcodec in audio classification and ASR
* [commit to revert] build ci-dev images
* [commit to revert] trigger circleci
* [commit to revert] build ci-dev images
* fix
* fix modeling_hubert
* backward compatible run_object_detection
* revert ci trigger commits
* fix mono conversion and support torch tensor as input
* revert map_to_array docs + fix it
* revert mono
* nit in docstring
* style
* fix modular
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-07-08 17:06:12 +02:00
cyyever
1e6b546ea6
Use Python 3.9 syntax in tests (#37343)
Signed-off-by: cyy <cyyever@outlook.com>
2025-04-08 14:12:08 +02:00
Yih-Dar
adfc91cd46
Try to avoid/reduce some remaining CI job failures (#37202)
* try
* try
* Update tests/pipelines/test_pipelines_video_classification.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-04-02 14:39:57 +02:00
eustlb
fb8e6c50e4
[audio utils] fix fft_bin_width computation (#36603)
* fix fft_bin_width computation
* update docstring + enforce correct params
* update test with correct value
* update test
* update feature extractors for concerned models
* update
* make
* update docstring
* update docstring
2025-03-27 15:20:02 +01:00
Sanchit Gandhi
f83c6f1d02
Remove trust_remote_code when loading Libri Dummy (#31748)
* [whisper integration] use parquet dataset for testing
* propagate to others
* more propagation
* last one
2024-07-23 14:54:38 +08:00
Zhiyong Wang
dce253f645
Add implementation of spectrogram_batch (#27159)
* Add initial implementation of `spectrogram_batch`
* Format the initial implementation
* Add test suite for the `spectrogram_batch`
* Update `spectrogram_batch` to ensure compatibility with test suite
* Update `spectrogram_batch` to include pre and post-processing
* Add `amplitude_to_db_batch` function and associated tests
* Add `power_to_db_batch` function and associated tests
* Reimplement the test suite for `spectrogram_batch`
* Fix errors in `spectrogram_batch`
* Add the function annotation for `spectrogram_batch`
* Address code quality
* Re-add `test_chroma_equivalence` function
* Update src/transformers/audio_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/audio_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-06-24 09:19:12 +02:00
Albert Villanova del Moral
a14b055b65
Pass datasets trust_remote_code (#31406)
* Pass datasets trust_remote_code
* Pass trust_remote_code in more tests
* Add trust_remote_dataset_code arg to some tests
* Revert "Temporarily pin datasets upper version to fix CI"
This reverts commit b7672826ca.
* Pass trust_remote_code in librispeech_asr_dummy docstrings
* Revert "Pin datasets<2.20.0 for examples"
This reverts commit 833fc17a3e.
* Pass trust_remote_code to all examples
* Revert "Add trust_remote_dataset_code arg to some tests" to research_projects
* Pass trust_remote_code to tests
* Pass trust_remote_code to docstrings
* Fix flax examples tests requirements
* Pass trust_remote_dataset_code arg to tests
* Replace trust_remote_dataset_code with trust_remote_code in one example
* Fix duplicate trust_remote_code
* Replace args.trust_remote_dataset_code with args.trust_remote_code
* Replace trust_remote_dataset_code with trust_remote_code in parser
* Replace trust_remote_dataset_code with trust_remote_code in dataclasses
* Replace trust_remote_dataset_code with trust_remote_code arg
2024-06-17 17:29:13 +01:00
Yoach Lacombe
c43b380e70
Add MusicGen Melody (#28819)
* first modeling code
* make repository
* still WIP
* update model
* add tests
* add latest change
* clean docstrings and copied from
* update docstrings md and readme
* correct chroma function
* correct copied from and remove unrelated test
* add doc to toctree
* correct imports
* add convert script to notdoctested
* Add suggestion from Sanchit
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* correct get_unconditional_inputs docstrings
* modify README according to SANCHIT feedback
* add chroma to audio utils
* clean librosa and torchaudio hard dependencies
* fix FE
* refactor audio decoder -> audio encoder for consistency with previous musicgen
* refactor conditional -> encoder
* modify sampling rate logics
* modify license at the beginning
* refactor all_self_attns->all_attentions
* remove ignore copy from causallm generate
* add copied from for from_sub_models
* fix make copies
* add warning if audio is truncated
* add copied from where relevant
* remove artefact
* fix convert script
* fix torchaudio and FE
* modify chroma method according to feedback -> better naming
* refactor input_values->input_features
* refactor input_values->input_features and fix import fe
* add input_features to docstrings
* correct inputs_embeds logics
* remove dtype conversion
* refactor _prepare_conditional_hidden_states_kwargs_for_generation ->_prepare_encoder_hidden_states_kwargs_for_generation
* change warning for chroma length
* Update src/transformers/models/musicgen_melody/convert_musicgen_melody_transformers.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* change way to save wav, using soundfile
* correct docs and change to soundfile
* fix import
* fix init proj layers
* remove line breaks from md
* fix issue with docstrings
* add FE suggestions
* improve is in logics and remove useless imports
* remove custom from_pretrained
* simplify docstring code
* add suggestions for modeling tests
* make style
* update converting script with sanity check
* remove encoder attention mask from conditional generation
* replace musicgen melody checkpoints with official orga
* rename ylacombe->facebook in checkpoints
* fix copies
* remove unnecessary warning
* add shape in code docstrings
* add files to slow doc tests
* fix md bug and add md to not_tested
* make fix-copies
* fix hidden states test and batching
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
2024-03-18 13:06:12 +00:00
Yoach Lacombe
9a30753485
Porting the torchaudio kaldi fbank implementation to audio_utils (#26182)
* add kaldi fbank
* make style
* add hertz_to_mel_kaldi tests
* add mel to hertz kaldi test
* integration tests
* correct test and remove comment
* make style
* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* change parameter name
* Apply suggestions from Arthur review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update remove_dc_offset description
* fix bug + make style
* fix error in using np.exp instead of np.power
* make style
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-09-21 17:52:47 +02:00
Matthijs Hollemans
7f91950901
audio_utils improvements (#21998)
* silly change to allow making a PR
* clean up doc comments
* simplify hertz_to_mel and mel_to_hertz
* fixup
* clean up power_to_db
* also add amplitude_to_db
* move functions
* clean up mel_filter_bank
* fixup
* credit librosa & torchaudio authors
* add unit tests
* tests for power_to_db and amplitude_to_db
* add mel_filter_bank tests
* rewrite STFT
* add convenience spectrogram function
* missing transpose
* fewer transposes
* add integration test to M-CTC-T
* frame length can be either window or FFT length
* rewrite stft API
* add preemphasis coefficient
* move argument
* add log option to spectrogram
* replace M-CTC-T feature extractor
* fix api thing
* replace whisper STFT
* replace whisper mel filters
* replace tvlt's stft
* allow alternate window names
* replace speecht5 stft
* fixup
* fix integration tests
* fix doc comments
* remove manual FFT length calculation
* fix docs
* go away, deprecation warnings
* combine everything into spectrogram function
* add deprecated functions back
* fixup
2023-05-09 09:10:17 -04:00