[Docs] Model_doc structure/clarity improvements (#26876)
* first batch of structure improvements for model_docs
* second batch of structure improvements for model_docs
* more structure improvements for model_docs
* more structure improvements for model_docs
* structure improvements for cv model_docs
* more structural refactoring
* addressed feedback about image processors
@@ -21,7 +21,6 @@ rendered properly in your Markdown viewer.
Flan-UL2 is an encoder-decoder model based on the T5 architecture. It uses the same configuration as the [UL2](ul2) model released earlier last year.
It was fine-tuned using the "Flan" prompt tuning and dataset collection. As with `Flan-T5`, the FLAN-UL2 weights can be used directly without finetuning the model:
According to the original blog, here are the notable improvements:
- The original UL2 model was only trained with a receptive field of 512, which made it non-ideal for N-shot prompting where N is large.
@@ -29,9 +28,6 @@ According to the original blog here are the notable improvements:
- The original UL2 model also had mode switch tokens that were more or less mandatory for good performance. However, they were a little cumbersome, as they often require changes during inference or finetuning. In this update/change, we continue training UL2 20B for an additional 100k steps (with a small batch size) to forget “mode tokens” before applying Flan instruction tuning. This Flan-UL2 checkpoint no longer requires mode tokens.
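
To make the receptive-field point concrete, the following is a minimal sketch of how the input window limits the number of few-shot examples. The helper `max_shots` and all token counts are hypothetical and chosen only for illustration; 512 is the UL2 figure quoted above, while 2048 is an assumed larger window used for comparison.

```python
def max_shots(context_len: int, tokens_per_shot: int, query_tokens: int) -> int:
    """Return the largest N such that N shots plus the query fit in the input window.

    Hypothetical helper for illustration only; real token counts depend on
    the tokenizer and on the prompt template.
    """
    available = context_len - query_tokens
    if available <= 0 or tokens_per_shot <= 0:
        return 0
    return available // tokens_per_shot


# Assumed sizes: each in-context example takes 120 tokens, the query takes 60.
shots_512 = max_shots(512, tokens_per_shot=120, query_tokens=60)
shots_2048 = max_shots(2048, tokens_per_shot=120, query_tokens=60)
print(shots_512, shots_2048)
```

Under these assumed sizes, only a handful of examples fit in a 512-token window, which is why a short receptive field is limiting when N is large.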
Google has released the following variants:
Refer to [T5's documentation page](t5) for all tips, code examples and notebooks, and to the FLAN-T5 model card for more details regarding training and evaluation of the model.
The original checkpoints can be found [here](https://github.com/google-research/t5x/blob/main/docs/models.md#flan-ul2-checkpoints).
@@ -51,6 +47,8 @@ The model is pretty heavy (~40GB in half precision) so if you just want to run t
['In a large skillet, brown the ground beef and onion over medium heat. Add the garlic']
```
## Inference
<Tip>
The inference protocol is exactly the same as for any `T5` model; see [T5's documentation page](t5) for more details.
Refer to [T5's documentation page](t5) for API reference, tips, code examples and notebooks.
</Tip>