[Ernie 4.5] Post merge adaptations (#39664)

* ernie 4.5 fixes

* Apply style fixes

* fix

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Anton Vlasjuk
2025-07-25 17:36:18 +02:00
committed by GitHub
parent 5d0ba3e479
commit a91653561e
10 changed files with 126 additions and 101 deletions


@@ -31,7 +31,7 @@ The Ernie 4.5 model was released in the [Ernie 4.5 Model Family](https://ernie.b
This family of models contains multiple different architectures and model sizes. This model in specific targets the base text
model without mixture of experts (moe) with 0.3B parameters in total. It uses the standard [Llama](./llama.md) at its core.
-Other models from the family can be found at [Ernie 4.5 MoE](./ernie4_5_moe.md).
+Other models from the family can be found at [Ernie 4.5 Moe](./ernie4_5_moe.md).
<div class="flex justify-center">
<img src="https://ernie.baidu.com/blog/posts/ernie4.5/overview.png"/>


@@ -23,11 +23,11 @@ rendered properly in your Markdown viewer.
</div>
</div>
-# Ernie 4.5 MoE
+# Ernie 4.5 Moe
## Overview
-The Ernie 4.5 MoE model was released in the [Ernie 4.5 Model Family](https://ernie.baidu.com/blog/posts/ernie4.5/) release by baidu.
+The Ernie 4.5 Moe model was released in the [Ernie 4.5 Model Family](https://ernie.baidu.com/blog/posts/ernie4.5/) release by baidu.
This family of models contains multiple different architectures and model sizes. This model in specific targets the base text
model with mixture of experts (moe) - one with 21B total, 3B active parameters and another one with 300B total, 47B active parameters.
It uses the standard [Llama](./llama.md) at its core combined with a specialized MoE based on [Mixtral](./mixtral.md) with additional shared
@@ -167,17 +167,17 @@ This model was contributed by [Anton Vlasjuk](https://huggingface.co/AntonV).
The original code can be found [here](https://github.com/PaddlePaddle/ERNIE).
-## Ernie4_5_MoEConfig
+## Ernie4_5_MoeConfig
-[[autodoc]] Ernie4_5_MoEConfig
+[[autodoc]] Ernie4_5_MoeConfig
-## Ernie4_5_MoEModel
+## Ernie4_5_MoeModel
-[[autodoc]] Ernie4_5_MoEModel
+[[autodoc]] Ernie4_5_MoeModel
- forward
-## Ernie4_5_MoEForCausalLM
+## Ernie4_5_MoeForCausalLM
-[[autodoc]] Ernie4_5_MoEForCausalLM
+[[autodoc]] Ernie4_5_MoeForCausalLM
- forward
- generate
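
The hunks above rename the `Ernie4_5_MoE*` classes to `Ernie4_5_Moe*`. Downstream code that still uses the old spelling could be updated with a small helper along these lines. This is only a sketch: `rename_ernie_moe_identifiers` is a hypothetical name and is not part of this commit or of the library. It assumes that only identifiers with the `Ernie4_5_MoE` prefix need to change.

```python
import re

# Hypothetical migration helper, not part of the commit.
# Matches the pre-rename class prefix seen in the diff (e.g. Ernie4_5_MoEConfig);
# the lookahead requires a following capital letter so only class-style
# identifiers are touched.
_OLD_PREFIX = re.compile(r"\bErnie4_5_MoE(?=[A-Z])")


def rename_ernie_moe_identifiers(source: str) -> str:
    """Rewrite Ernie4_5_MoE* identifiers to the Ernie4_5_Moe* spelling."""
    return _OLD_PREFIX.sub("Ernie4_5_Moe", source)


if __name__ == "__main__":
    snippet = "from transformers import Ernie4_5_MoEConfig, Ernie4_5_MoEForCausalLM"
    print(rename_ernie_moe_identifiers(snippet))
```

The pattern leaves text that is already renamed untouched, and it does not match prose such as "Ernie 4.5 MoE", so the helper can be run more than once on the same file.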