
Z.ai models: the complete catalog overview

Every active Z.ai model lineage in one organised reference — GLM-4 generations, the ChatGLM open-weight lineage, code-specialised GLM Coder, multimodal vision-language variants, and embedding models.

At a glance: 5 active lineages · 128K max context · 100B+ flagship parameters · 26+ languages.

A catalog read top-down rather than left-to-right

The most useful way to read the Z.ai model catalog is by lineage first, then by variant within a lineage. Start with the family branch you actually need before worrying about parameter sizes.

The Z.ai catalog spans five active lineages and up to three or four parameter classes per lineage. That is broader than most open-weight families, and it is the reason readers ask for an overview page like this one — the GLM-4 generation alone has more variants than some entire competitor families publish in a year.

Recap Capsule

Pick the lineage first (chat / code / multimodal / embedding). Then pick the parameter class for your hardware. The right Z.ai variant for your workload almost certainly exists; it just might not be the one with the most marketing attention.

This catalog page is the broader companion to the ai-model page in the Models silo, which focuses on the family-level framing. Together, they cover the catalog from two angles — the high-level overview here, and the practical "pick a variant" framing on the silo page.

The five active lineages and what each is for

A short orientation to each lineage with a one-line "best for" hint.

The GLM-4 family is the general-purpose chat lineage. It covers most everyday workloads — summarisation, classification, structured extraction, conversational interfaces. The 4.5+ refresh extended context windows and tightened instruction-following. For most teams, this is the default starting point.

The ChatGLM open-weight lineage is the historical foundation that put Zhipu AI on the open-weight map. It remains the most-cloned variant for local-inference experiments and the easiest entry point for outside developers running on consumer hardware.

The GLM Coder branch is the code-specialised counterpart. It scores at the top of the open-weight bracket on HumanEval and MBPP. The smaller Coder variants run inside an IDE assistant on a developer laptop without external compute.

The multimodal vision-language variants accept images alongside text. They handle document understanding, OCR-style extraction, image captioning, and structured output from visual sources. The trade-off versus pure-text Z.ai siblings is a slightly higher per-token cost on hosted inference.
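As a sketch of how an image-plus-text request to such a variant is typically structured — assuming an OpenAI-compatible chat endpoint and a hypothetical model name, which the Z.ai API docs should be checked against — the payload pairs a base64-encoded image with a text instruction:

```python
import base64
import json

def build_vision_payload(image_bytes: bytes, instruction: str,
                         model: str = "glm-4v") -> dict:
    """Build an OpenAI-style chat payload pairing an image with a text prompt.

    The model name and message shape here are assumptions based on common
    OpenAI-compatible hosted APIs, not confirmed Z.ai specifics.
    """
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": instruction},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    }

payload = build_vision_payload(b"\x89PNG...", "Extract the invoice total as JSON.")
print(json.dumps(payload, indent=2)[:80])
```

This is where the higher per-token cost mentioned above comes from: the encoded image is billed as part of the input alongside the text instruction.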

The embedding models are smaller, specialised builds optimised for retrieval workloads. They are the right pick for a RAG pipeline, vector-search backend, or semantic-similarity ranking task. They are not chat models and should not be used as such.
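The retrieval role of the embedding lineage can be sketched in a few lines: embed the query and the documents, then rank by cosine similarity. The toy vectors below stand in for embedding-model output — a real pipeline would call the hosted embedding endpoint to produce them.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors; real embedding models emit hundreds of dimensions.
docs = {
    "refund policy":  [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "api reference":  [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.1]

ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
print(ranked[0])  # → refund policy
```

The same ranking step is what a vector-search backend performs at scale; the embedding model's only job is producing the vectors.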

Z.ai model lineages mapped to typical workload classes.
| Lineage | Recent release names | Parameter classes shipped |
| --- | --- | --- |
| GLM-4 family | GLM-4, GLM-4.5+ refreshes | Small / mid / flagship |
| ChatGLM lineage | ChatGLM3, ChatGLM4 community builds | 6B / 9B / mid-size |
| GLM Coder | CodeGeeX / GLM-Coder builds | Small / mid |
| Vision-Language | GLM-4V, multimodal variants | Mid / flagship |
| Embedding | GLM embedding builds | Small only |

How to read the version numbering

Major-minor numbering. The number does not denote parameter size; that lives in the model card.

Version numbers follow a major-minor pattern. GLM-4 is the fourth generation; GLM-4.5+ is a refresh within the fourth generation. ChatGLM3 is the third generation of the open-weight chat lineage. The number does not denote parameter size — that is communicated separately in the model card. A "small ChatGLM3" and a "small ChatGLM4" are different models with different architectures, not just different weights.
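The major-minor convention described above can be made concrete with an illustrative parser (the function name and the trailing-`+` handling are my own sketch, not an official scheme):

```python
def parse_generation(name: str) -> tuple[int, int]:
    """Split a GLM-style model name into (major, minor) generation numbers.

    'GLM-4' -> (4, 0); 'GLM-4.5+' -> (4, 5). Parameter size is deliberately
    absent from the name, per the catalog's numbering convention.
    """
    version = name.rsplit("-", 1)[-1].rstrip("+")
    major, _, minor = version.partition(".")
    return int(major), int(minor or 0)

print(parse_generation("GLM-4"))     # (4, 0)
print(parse_generation("GLM-4.5+"))  # (4, 5)
```

Two names with the same major number share a generation; a minor bump signals a refresh within it, exactly as with GLM-4 and GLM-4.5+.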

For procurement-minded readers, the most important number on a model card is the parameter count, not the generation. Hardware fit is determined by parameters; quality fit is determined by the lineage and the generation together. NIST publishes useful guidance on AI evaluation methodology that is worth a read alongside the model-card review process.
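The "hardware fit is determined by parameters" point reduces to back-of-envelope arithmetic: weight memory is roughly parameter count times bytes per parameter. The sketch below covers weights only and ignores KV cache, activations, and runtime overhead, so treat it as a lower bound; the 9B figure is used purely as an example size.

```python
def approx_weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Rough memory footprint of model weights alone, in decimal GB.

    Excludes KV cache, activations, and framework overhead, which add
    a meaningful margin on top in practice.
    """
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9

# A 9B-parameter model, roughly the mid class in the ChatGLM lineage:
print(round(approx_weight_memory_gb(9, 16), 1))  # fp16 → 18.0
print(round(approx_weight_memory_gb(9, 4), 1))   # int4 → 4.5
```

This is why quantised open-weight builds matter for consumer hardware: the same generation and lineage fits in a quarter of the memory at int4.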

Frequently asked questions

Five questions cover the most common reader queries about the Z.ai model catalog.

How many distinct model lineages does Z.ai ship?

Z.ai currently ships five active lineages — the general-purpose GLM-4 family, the open-weight ChatGLM lineage, the code-specialised GLM Coder branch, multimodal vision-language variants, and embedding models for retrieval workloads.

How should I read the version numbering?

Version numbers follow a major-minor pattern. GLM-4 is the fourth generation; GLM-4.5+ indicates a refresh within that generation. The number does not denote parameter size — that is communicated separately in the model card.

How often does the catalog refresh?

The Zhipu AI team ships several releases per quarter across the various lineages. Reading the catalog top-down by lineage is more useful than tracking individual version bumps; pick a lineage first, then the right variant within it.

Are all variants available with open weights?

Most ChatGLM and many GLM-4 family variants ship with open weights on Hugging Face. Some flagship sizes are hosted-only by default. The model card for each release states the license footprint clearly.

Where do code variants fit in the catalog?

GLM Coder is the code-specialised branch. It is fine-tuned on a curated programming corpus and routinely scores at the top of the open-weight bracket on HumanEval and MBPP. The smaller Coder variants run on consumer hardware.

Related Z.ai topics

Cluster keyword anchors for readers who want to dig into a specific lineage or surface.

For deeper coverage, the GLM AI model page covers the family at the silo level, the ChatGLM reference covers the open-weight lineage, the Zhipu AI LLM page covers text-line positioning, the benchmarks page covers how the family scores publicly, and the AI model page covers the practical "pick a variant" framing.