GLM-4.5V Model Support (#39805)
Some checks failed
Secret Leaks / trufflehog (push) Has been cancelled

* init

* update

* uupdate

* ruff

* t patch is 2 defalut not 1

* draft

* back

* back1

* update

* config update

* update using glm-41 format

* add self.rope_scaling = config.rope_scaling

* update config

* update

* remove the processor

* update

* fix tests

* update

* for test

* update

* update 2126

* self.rope_scaling is missing in GLM4MOE lets add it

* update

* update

* Update modular_glm4v_moe.py

* change config

* update apply_multimodal_rotary_pos_emb

* format

* update

* Delete 3-rollout_qas_thinking_answers.py

* use right name

* update with place holder

* update

* use right rotary

* Update image_processing_glm4v_fast.py

* rope_config_validation needs to rewrite the entire config file in modular

* update

* changed name

* update

* Update modeling_glm4v_moe.py

* _init_weights shoud be add in Glm4vMoePreTrainedModel

* remove use_qk_norm

* Update modular_glm4v_moe.py

* remove use_qk_norm as it is not use

* fix style

* deprecations are not needed on new models

* fix merge issues

---------

Co-authored-by: raushan <raushan@huggingface.co>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Arthur <arthur.zucker@gmail.com>
This commit is contained in:
Yuxuan Zhang
2025-08-08 23:39:52 +08:00
committed by GitHub
parent d2ba153b29
commit 7b20915f4e
20 changed files with 3723 additions and 536 deletions

View File

@@ -1009,6 +1009,8 @@
title: GIT
- local: model_doc/glm4v
title: glm4v
- local: model_doc/glm4v_moe
title: glm4v_moe
- local: model_doc/got_ocr2
title: GOT-OCR2
- local: model_doc/granitevision

View File

@@ -0,0 +1,64 @@
<!--Copyright 2025 The ZhipuAI Inc. and The HuggingFace Inc. team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contain specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
<div style="float: right;">
<div class="flex flex-wrap space-x-1">
<img alt="PyTorch" src="https://img.shields.io/badge/PyTorch-DE3412?style=flat&logo=pytorch&logoColor=white">
<img alt="FlashAttention" src="https://img.shields.io/badge/%E2%9A%A1%EF%B8%8E%20FlashAttention-eae0c8?style=flat">
<img alt="SDPA" src="https://img.shields.io/badge/SDPA-DE3412?style=flat&logo=pytorch&logoColor=white"> </div>
</div>
# Glm4vMoe
## Overview
The Glm4vMoe model was proposed in [<INSERT PAPER NAME HERE>](<INSERT PAPER LINK HERE>) by <INSERT AUTHORS HERE>.
<INSERT SHORT SUMMARY HERE>
The abstract from the paper is the following:
*<INSERT PAPER ABSTRACT HERE>*
Tips:
<INSERT TIPS ABOUT MODEL HERE>
This model was contributed by [INSERT YOUR HF USERNAME HERE](https://huggingface.co/<INSERT YOUR HF USERNAME HERE>).
The original code can be found [here](<INSERT LINK TO GITHUB REPO HERE>).
## Glm4vMoeConfig
[[autodoc]] Glm4vMoeConfig
## Glm4vMoeTextConfig
[[autodoc]] Glm4vMoeTextConfig
## Glm4vMoeTextModel
[[autodoc]] Glm4vMoeTextModel
- forward
## Glm4vMoeModel
[[autodoc]] Glm4vMoeModel
- forward
## Glm4vMoeForConditionalGeneration
[[autodoc]] Glm4vMoeForConditionalGeneration
- forward