[Ernie 4.5] Post merge adaptations (#39664)
* ernie 4.5 fixes * Apply style fixes * fix --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
This commit is contained in:
@@ -31,7 +31,7 @@ The Ernie 4.5 model was released in the [Ernie 4.5 Model Family](https://ernie.b
|
||||
This family of models contains multiple different architectures and model sizes. This model in specific targets the base text
|
||||
model without mixture of experts (moe) with 0.3B parameters in total. It uses the standard [Llama](./llama.md) at its core.
|
||||
|
||||
Other models from the family can be found at [Ernie 4.5 MoE](./ernie4_5_moe.md).
|
||||
Other models from the family can be found at [Ernie 4.5 Moe](./ernie4_5_moe.md).
|
||||
|
||||
<div class="flex justify-center">
|
||||
<img src="https://ernie.baidu.com/blog/posts/ernie4.5/overview.png"/>
|
||||
|
||||
@@ -23,11 +23,11 @@ rendered properly in your Markdown viewer.
|
||||
</div>
|
||||
</div>
|
||||
|
||||
# Ernie 4.5 MoE
|
||||
# Ernie 4.5 Moe
|
||||
|
||||
## Overview
|
||||
|
||||
The Ernie 4.5 MoE model was released in the [Ernie 4.5 Model Family](https://ernie.baidu.com/blog/posts/ernie4.5/) release by baidu.
|
||||
The Ernie 4.5 Moe model was released in the [Ernie 4.5 Model Family](https://ernie.baidu.com/blog/posts/ernie4.5/) release by baidu.
|
||||
This family of models contains multiple different architectures and model sizes. This model in specific targets the base text
|
||||
model with mixture of experts (moe) - one with 21B total, 3B active parameters and another one with 300B total, 47B active parameters.
|
||||
It uses the standard [Llama](./llama.md) at its core combined with a specialized MoE based on [Mixtral](./mixtral.md) with additional shared
|
||||
@@ -167,17 +167,17 @@ This model was contributed by [Anton Vlasjuk](https://huggingface.co/AntonV).
|
||||
The original code can be found [here](https://github.com/PaddlePaddle/ERNIE).
|
||||
|
||||
|
||||
## Ernie4_5_MoEConfig
|
||||
## Ernie4_5_MoeConfig
|
||||
|
||||
[[autodoc]] Ernie4_5_MoEConfig
|
||||
[[autodoc]] Ernie4_5_MoeConfig
|
||||
|
||||
## Ernie4_5_MoEModel
|
||||
## Ernie4_5_MoeModel
|
||||
|
||||
[[autodoc]] Ernie4_5_MoEModel
|
||||
[[autodoc]] Ernie4_5_MoeModel
|
||||
- forward
|
||||
|
||||
## Ernie4_5_MoEForCausalLM
|
||||
## Ernie4_5_MoeForCausalLM
|
||||
|
||||
[[autodoc]] Ernie4_5_MoEForCausalLM
|
||||
[[autodoc]] Ernie4_5_MoeForCausalLM
|
||||
- forward
|
||||
- generate
|
||||
|
||||
Reference in New Issue
Block a user