Glm 4 doc (#39247)
* update the glm4 model readme
* update test
* update GLM-4.1V model
* update as format
* update
* fix some tests
* fix the rest
* fix on a10, not t4
* nit: dummy import

---------

Co-authored-by: raushan <raushan@huggingface.co>
@@ -18,7 +18,37 @@ rendered properly in your Markdown viewer.
## Overview

To be released with the official model launch.
The GLM family welcomes new members: the [GLM-4-0414](https://arxiv.org/pdf/2406.12793) series models.

The **GLM-4-32B-0414** series features 32 billion parameters. Its performance is comparable to OpenAI’s GPT
series and DeepSeek’s V3/R1 series. It also supports user-friendly local deployment. GLM-4-32B-Base-0414
was pre-trained on 15T of high-quality data, including substantial reasoning-type synthetic data. This lays the
foundation for subsequent reinforcement learning extensions. In the post-training stage, we employed human preference
alignment for dialogue scenarios. Additionally, using techniques like rejection sampling and reinforcement learning, we
enhanced the model’s performance in instruction following, engineering code, and function calling, thus strengthening
the atomic capabilities required for agent tasks. GLM-4-32B-0414 achieves good results in engineering code, Artifact
generation, function calling, search-based Q&A, and report generation. In particular, on several benchmarks, such as
code generation or specific Q&A tasks, GLM-4-32B-Base-0414 achieves performance comparable to that of larger models
such as GPT-4o and DeepSeek-V3-0324 (671B).

**GLM-Z1-32B-0414** is a reasoning model with deep thinking capabilities. It was developed based on GLM-4-32B-0414
through cold start, extended reinforcement learning, and further training on tasks including mathematics, code, and
logic. Compared to the base model, GLM-Z1-32B-0414 significantly improves mathematical abilities and the capability to
solve complex tasks. During training, we also introduced general reinforcement learning based on pairwise ranking
feedback, which enhances the model's general capabilities.

**GLM-Z1-Rumination-32B-0414** is a deep reasoning model with rumination capabilities (positioned against OpenAI's Deep Research).
Unlike typical deep thinking models, the rumination model is capable of deeper and longer thinking to solve more
open-ended and complex problems (e.g., writing a comparative analysis of AI development in two cities and their future
development plans). Z1-Rumination is trained by scaling end-to-end reinforcement learning, with responses graded
against ground-truth answers or rubrics, and it can use search tools during its deep thinking process to handle
complex tasks. The model shows significant improvements in research-style writing and complex tasks.

Finally, **GLM-Z1-9B-0414** is a surprise. We employed all the aforementioned techniques to train a small model (9B).
GLM-Z1-9B-0414 exhibits excellent capabilities in mathematical reasoning and general tasks. Its overall performance is
top-ranked among all open-source models of the same size. Especially in resource-constrained scenarios, this model
achieves an excellent balance between efficiency and effectiveness, providing a powerful option for users seeking
lightweight deployment.
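
The overview names rejection sampling as one of the post-training techniques but does not describe the procedure. As a rough illustration only, and not the GLM team's actual pipeline, the general idea can be sketched as best-of-n filtering: draw several candidate responses, score each one, and keep the best candidate only if it clears a threshold. The `generate` and `score` callables, the `n_candidates` and `threshold` parameters, and the toy stand-ins below are all hypothetical names introduced for this sketch.

```python
import random
from typing import Callable, Optional


def rejection_sample(
    prompt: str,
    generate: Callable[[str], str],      # hypothetical: returns one candidate response
    score: Callable[[str, str], float],  # hypothetical: reward for (prompt, response)
    n_candidates: int = 8,
    threshold: float = 0.5,
) -> Optional[str]:
    """Return the best-scoring candidate, or None if none reaches the threshold."""
    candidates = [generate(prompt) for _ in range(n_candidates)]
    best = max(candidates, key=lambda response: score(prompt, response))
    return best if score(prompt, best) >= threshold else None


# Toy stand-ins so the sketch runs without a model (assumptions, not real components).
rng = random.Random(0)
toy_answers = ["4", "5", "22", "four"]


def toy_generate(prompt: str) -> str:
    # Pretend sampler: picks one canned answer at random.
    return rng.choice(toy_answers)


def toy_score(prompt: str, response: str) -> float:
    # Pretend verifier: only the exact string "4" counts as correct.
    return 1.0 if response == "4" else 0.0


kept = rejection_sample("What is 2 + 2?", toy_generate, toy_score)
print(kept)  # "4" if any sampled candidate was correct, otherwise None
```

In a training setup, the accepted responses would typically be collected into a fine-tuning set; that step is omitted here.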
## Glm4Config