[Docs] Model_doc structure/clarity improvements (#26876)

* first batch of structure improvements for model_docs

* second batch of structure improvements for model_docs

* more structure improvements for model_docs

* more structure improvements for model_docs

* structure improvements for cv model_docs

* more structural refactoring

* addressed feedback about image processors
This commit is contained in:
Maria Khalusova
2023-11-03 10:57:03 -04:00
committed by GitHub
parent ad8ff96224
commit 5964f820db
223 changed files with 1796 additions and 1116 deletions

View File

@@ -32,7 +32,11 @@ The metadata such as *artist, genre and timing* are passed to each prior, in the
![JukeboxModel](https://gist.githubusercontent.com/ArthurZucker/92c1acaae62ebf1b6a951710bdd8b6af/raw/c9c517bf4eff61393f6c7dec9366ef02bdd059a3/jukebox.svg)
Tips:
This model was contributed by [Arthur Zucker](https://huggingface.co/ArthurZ).
The original code can be found [here](https://github.com/openai/jukebox).
## Usage tips
- This model only supports inference. This is for a few reasons, mostly because it requires a crazy amount of memory to train. Feel free to open a PR and add what's missing to have a full integration with the hugging face traineer!
- This model is very slow, and takes 8h to generate a minute long audio using the 5b top prior on a V100 GPU. In order automaticallay handle the device on which the model should execute, use `accelerate`.
- Contrary to the paper, the order of the priors goes from `0` to `1` as it felt more intuitive : we sample starting from `0`.
@@ -67,14 +71,12 @@ The original code can be found [here](https://github.com/openai/jukebox).
- upsample
- _sample
## JukeboxPrior
[[autodoc]] JukeboxPrior
- sample
- forward
## JukeboxVQVAE
[[autodoc]] JukeboxVQVAE