[Docs] Model_doc structure/clarity improvements (#26876)
* first batch of structure improvements for model_docs
* second batch of structure improvements for model_docs
* more structure improvements for model_docs
* more structure improvements for model_docs
* structure improvements for cv model_docs
* more structural refactoring
* addressed feedback about image processors
@@ -37,7 +37,10 @@ state-of-the-art performance in universal representation learning, especially fo
 tasks. An ablation study is performed verifying the efficacy of each proposed method. Finally, we scale up training
 dataset to 94 thousand hours public audio data and achieve further performance improvement in all SUPERB tasks.*
 
-Tips:
+This model was contributed by [patrickvonplaten](https://huggingface.co/patrickvonplaten). The Authors' code can be
+found [here](https://github.com/microsoft/UniSpeech/tree/main/UniSpeech-SAT).
+
+## Usage tips
 
 - UniSpeechSat is a speech model that accepts a float array corresponding to the raw waveform of the speech signal.
 Please use [`Wav2Vec2Processor`] for the feature extraction.
@@ -45,10 +48,7 @@ Tips:
 decoded using [`Wav2Vec2CTCTokenizer`].
 - UniSpeechSat performs especially well on speaker verification, speaker identification, and speaker diarization tasks.
 
-This model was contributed by [patrickvonplaten](https://huggingface.co/patrickvonplaten). The Authors' code can be
-found [here](https://github.com/microsoft/UniSpeech/tree/main/UniSpeech-SAT).
-
-## Documentation resources
+## Resources
 
 - [Audio classification task guide](../tasks/audio_classification)
 - [Automatic speech recognition task guide](../tasks/asr)
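
The usage tip in the first hunk says UniSpeechSat takes a float array holding the raw waveform and that [`Wav2Vec2Processor`] should handle feature extraction. As a rough, library-free sketch of what such an input might look like, the snippet below synthesizes a mono waveform and applies zero-mean, unit-variance scaling. The 16 kHz sample rate, the sine-wave stand-in for speech, and the helper names `make_waveform` and `normalize` are all assumptions made for illustration. They are not part of this commit, and in practice the processor would do this preparation.

```python
import math
from array import array

# Assumption: 16 kHz mono audio; check the checkpoint's expected sampling rate.
SAMPLE_RATE = 16_000


def make_waveform(duration_s=1.0, freq_hz=440.0):
    """Synthesize a sine wave as a float array (hypothetical stand-in for speech)."""
    n_samples = int(SAMPLE_RATE * duration_s)
    return array(
        "f",
        (0.5 * math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE) for t in range(n_samples)),
    )


def normalize(waveform, eps=1e-7):
    """Scale to zero mean and unit variance (illustrative approximation only)."""
    n = len(waveform)
    mean = sum(waveform) / n
    variance = sum((x - mean) ** 2 for x in waveform) / n
    scale = math.sqrt(variance + eps)
    return array("f", ((x - mean) / scale for x in waveform))


waveform = make_waveform()   # raw float samples, one value per time step
inputs = normalize(waveform)  # same length, rescaled values
print(len(inputs), type(inputs[0]).__name__)
```

The point of the sketch is only the input shape: a one-dimensional sequence of floats, one per audio sample, rather than spectrogram features.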