[Docs] Model_doc structure/clarity improvements (#26876)
* first batch of structure improvements for model_docs
* second batch of structure improvements for model_docs
* more structure improvements for model_docs
* more structure improvements for model_docs
* structure improvements for cv model_docs
* more structural refactoring
* addressed feedback about image processors
@@ -14,8 +14,7 @@ specific language governing permissions and limitations under the License.
## Overview
Bark is a transformer-based text-to-speech model proposed by Suno AI in [suno-ai/bark](https://github.com/suno-ai/bark).
Bark is a transformer-based text-to-speech model proposed by Suno AI in [suno-ai/bark](https://github.com/suno-ai/bark).
Bark is made of 4 main models:
@@ -26,6 +25,9 @@ Bark is made of 4 main models:
Each of the first three modules can support conditional speaker embeddings, which condition the output sound on a specific predefined voice.
This model was contributed by [Yoach Lacombe (ylacombe)](https://huggingface.co/ylacombe) and [Sanchit Gandhi (sanchit-gandhi)](https://github.com/sanchit-gandhi).
The original code can be found [here](https://github.com/suno-ai/bark).
### Optimizing Bark
Bark can be optimized with just a few extra lines of code, which **significantly reduces its memory footprint** and **accelerates inference**.
@@ -86,7 +88,7 @@ model.enable_cpu_offload()
Find out more on inference optimization techniques [here](https://huggingface.co/docs/transformers/perf_infer_gpu_one).
### Tips
### Usage tips
Suno offers a library of voice presets in a number of languages [here](https://suno-ai.notion.site/8b8e8749ed514b0cbf3f699013548683?v=bc67cff786b04b50b3ceb756fd05f68c).
These presets are also available on the Hub [here](https://huggingface.co/suno/bark-small/tree/main/speaker_embeddings) or [here](https://huggingface.co/suno/bark/tree/main/speaker_embeddings).
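As a rough illustration of how a preset might be selected, the sketch below builds a preset identifier and passes it to the processor. The helper names `voice_preset_name` and `synthesize` are hypothetical, and the `v2/<language>_speaker_<index>` naming with indices 0 to 9 is an assumption based on the files in the speaker embeddings folders linked above; check the preset library for the names that actually exist.

```python
def voice_preset_name(language, speaker_index, version="v2"):
    """Build a preset identifier such as "v2/en_speaker_6".

    Assumption: presets follow the "<version>/<language>_speaker_<index>"
    pattern with indices 0-9; this helper is illustrative, not a library API.
    """
    if not 0 <= speaker_index <= 9:
        raise ValueError("speaker_index is assumed to be in the range 0-9")
    return f"{version}/{language}_speaker_{speaker_index}"


def synthesize(text, voice_preset, checkpoint="suno/bark-small"):
    """Generate a waveform for `text` conditioned on `voice_preset`.

    transformers is imported lazily so that the helper above can be used
    without the library installed. Calling this downloads the checkpoint.
    """
    from transformers import AutoProcessor, BarkModel

    processor = AutoProcessor.from_pretrained(checkpoint)
    model = BarkModel.from_pretrained(checkpoint)

    # The processor attaches the speaker embeddings for the chosen preset.
    inputs = processor(text, voice_preset=voice_preset)
    audio = model.generate(**inputs)
    return audio.cpu().numpy().squeeze()


preset = voice_preset_name("en", 6)
```

Calling `synthesize("Hello, my dog is cute", preset)` would then return a NumPy array of audio samples, assuming the checkpoint can be downloaded.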
@@ -142,11 +144,6 @@ To save the audio, simply take the sample rate from the model config and some sc
>>> write_wav("bark_generation.wav", sample_rate, audio_array)
```
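If SciPy is not available, the same save step can be sketched with Python's standard `wave` module. This is a minimal illustration, not the method the documentation uses: `SAMPLE_RATE` is an assumed 24 kHz placeholder for the value read from the model config, the helpers `float_to_pcm16` and `save_wav` are hypothetical, and a synthetic sine tone stands in for the generated `audio_array`.

```python
import math
import struct
import wave

# Assumption: placeholder for the sample rate taken from the model config.
SAMPLE_RATE = 24_000


def float_to_pcm16(samples):
    """Clip floats to [-1, 1] and pack them as little-endian 16-bit PCM."""
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    return struct.pack("<%dh" % len(ints), *ints)


def save_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write a mono 16-bit WAV file from a sequence of float samples."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(float_to_pcm16(samples))


# Stand-in for the model output: half a second of a 440 Hz tone.
audio_array = [
    0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
    for n in range(SAMPLE_RATE // 2)
]
save_wav("bark_generation_sketch.wav", audio_array)
```

With a real generation, `audio_array` would be the squeezed NumPy output of the model, and the sample rate would come from the model config rather than the constant above.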
This model was contributed by [Yoach Lacombe (ylacombe)](https://huggingface.co/ylacombe) and [Sanchit Gandhi (sanchit-gandhi)](https://github.com/sanchit-gandhi).
The original code can be found [here](https://github.com/suno-ai/bark).
## BarkConfig
[[autodoc]] BarkConfig