Z.ai integrations: third-party toolchain reference

How to integrate Z.ai and the BigModel API with the tools developers already use — LangChain, LlamaIndex, Cursor, Continue.dev, Haystack, Ollama community shims, n8n, and Zapier. Each integration pattern is documented from the perspective of what needs to change versus an OpenAI-based setup.

Working Memo

The core integration pattern for Z.ai is a base URL swap. Because the BigModel API follows the OpenAI chat-completions contract, any tool that already supports OpenAI via a configurable base URL can reach Z.ai with two changes: the base URL and the API key. Tools that do not support a configurable base URL require a shim or a community adapter.
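
A minimal sketch of that two-value swap, using the official OpenAI Python SDK. The endpoint URL matches the one quoted later on this page; the model identifier is an assumption to verify against your BigModel account.

```python
# Minimal base-URL swap using the OpenAI Python SDK (v1+).
from openai import OpenAI

client = OpenAI(
    base_url="https://open.bigmodel.cn/api/paas/v4/",  # change 1: base URL
    api_key="YOUR_BIGMODEL_API_KEY",                   # change 2: API key
)

response = client.chat.completions.create(
    model="glm-4",  # GLM identifier instead of an OpenAI model name
    messages=[{"role": "user", "content": "Hello from the swap test."}],
)
print(response.choices[0].message.content)
```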

The integration foundation: OpenAI compatibility

The BigModel API mirrors the OpenAI chat-completions contract, which means any tool that supports a custom OpenAI base URL can integrate Z.ai in minutes.

The BigModel API endpoint accepts requests formatted identically to OpenAI's chat-completions endpoint. The request body uses the same model, messages, temperature, and max_tokens fields. The response object follows the same structure, including the choices array and the usage token-count block. This compatibility means that the integration effort for most modern developer tools is limited to changing two values in a configuration file or settings screen: the base URL and the API key.
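
The same contract at the raw HTTP level, sketched with the requests library; the request fields and the choices and usage blocks in the response mirror OpenAI's schema.

```python
# Raw chat-completions call against the BigModel endpoint.
import requests

resp = requests.post(
    "https://open.bigmodel.cn/api/paas/v4/chat/completions",
    headers={"Authorization": "Bearer YOUR_BIGMODEL_API_KEY"},
    json={
        "model": "glm-4",
        "messages": [{"role": "user", "content": "Hello"}],
        "temperature": 0.7,
        "max_tokens": 256,
    },
    timeout=30,
)
data = resp.json()
print(data["choices"][0]["message"]["content"])  # same choices array
print(data["usage"])                             # same token-count block
```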

The GLM model identifiers used in the model field differ from OpenAI's identifiers — you use glm-4, glm-4-flash, or similar strings rather than gpt-4o. This is the only semantic change in the request object for standard chat-completions integrations. Tools that hard-code model names (rather than taking them from a configuration field) will need a minor code change at the call site.
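
Where that change is needed, the usual fix is to stop inlining the model string. A hypothetical sketch; the environment-variable names are illustrative, not part of any tool's API.

```python
# Take both values from the environment so the OpenAI -> GLM swap is a
# deployment change rather than a code change.
import os

from openai import OpenAI

client = OpenAI(
    base_url=os.environ["LLM_BASE_URL"],  # BigModel or OpenAI endpoint
    api_key=os.environ["LLM_API_KEY"],
)
MODEL = os.environ.get("LLM_MODEL", "glm-4")  # was hard-coded "gpt-4o"

response = client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user", "content": "Hello"}],
)
```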

Framework integrations

LangChain, LlamaIndex, and Haystack each have first-party or community-maintained support for Z.ai.

LangChain supports Z.ai through the ChatZhipuAI class in the langchain-community package. Instantiating it with a BigModel API key and a model name produces a chat model object that works identically to any other LangChain LLM — it can be dropped into chains, agents, and retrievers without modification. The class handles authentication, request formatting, and streaming responses internally. For teams already using LangChain with OpenAI, switching to Z.ai is a single class-name change at the model-constructor call.
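
A minimal sketch of that integration, assuming langchain-community is installed; the key is passed directly for brevity, and the constructor arguments should be verified against the current langchain-community release.

```python
# ChatZhipuAI from langchain-community wraps the BigModel API.
from langchain_community.chat_models import ChatZhipuAI

llm = ChatZhipuAI(
    api_key="YOUR_BIGMODEL_API_KEY",
    model="glm-4",
    temperature=0.5,
)

# Same interface as ChatOpenAI: invoke, stream, or drop into a chain.
print(llm.invoke("Say hello in five words.").content)
```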

LlamaIndex exposes Z.ai through its OpenAI-compatible LLM classes, which accept a custom api_base parameter (the OpenAILike variant additionally skips OpenAI model-name validation). Pointing the LLM at api_base="https://open.bigmodel.cn/api/paas/v4/" with a BigModel API key routes all completions through BigModel. The model identifier needs to be a GLM name rather than a GPT name. All LlamaIndex abstractions — query engines, RAG pipelines, agent loops — work without modification once the LLM object is pointed at BigModel.
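
A sketch using the OpenAILike variant (from the llama-index-llms-openai-like package), which accepts arbitrary model names; the is_chat_model flag is an assumption appropriate for a chat endpoint.

```python
# OpenAILike routes LlamaIndex completions to any OpenAI-compatible API.
from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
    model="glm-4",
    api_base="https://open.bigmodel.cn/api/paas/v4/",
    api_key="YOUR_BIGMODEL_API_KEY",
    is_chat_model=True,  # treat the endpoint as chat-completions
)

print(llm.complete("One-line summary of RAG.").text)
```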

Haystack supports custom OpenAI-compatible providers through its OpenAIChatGenerator component, which accepts an api_base_url parameter. The integration pattern is the same as for LlamaIndex: configure the base URL, supply the BigModel API key, and specify a GLM model name. Haystack pipelines built for OpenAI migrate to Z.ai with changes only to the generator component configuration.
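
A sketch of that generator configuration, assuming Haystack 2.x and the API key exported as BIGMODEL_API_KEY.

```python
# OpenAIChatGenerator pointed at the BigModel endpoint.
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret

generator = OpenAIChatGenerator(
    api_key=Secret.from_env_var("BIGMODEL_API_KEY"),
    model="glm-4",
    api_base_url="https://open.bigmodel.cn/api/paas/v4/",
)

result = generator.run(messages=[ChatMessage.from_user("Hello")])
print(result["replies"][0])  # a ChatMessage carrying the GLM response
```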

Code editor integrations: Cursor and Continue.dev

Both Cursor and Continue.dev support custom model endpoints, allowing GLM models to power code completions and chat in the editor.

Cursor exposes model configuration through its settings panel under the Models section. Adding a custom model with an OpenAI-compatible API format, entering the BigModel base URL, and supplying a BigModel API key routes Cursor's completion and chat requests to the GLM family. The GLM code-specialised variants are appropriate for code completion workloads; the general-purpose GLM-4 tier handles the conversational Composer and Ask modes well.

Continue.dev uses a JSON configuration file (~/.continue/config.json) that accepts a model block with a provider of openai, a custom apiBase, and an apiKey. Setting these to the BigModel values adds a Z.ai model to the Continue model selector. Continue's autocomplete and chat features then use the configured GLM model. Community contributors have published pre-built Continue configurations for specific GLM variants that cover the recommended context window and temperature settings.
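
A minimal config.json sketch of that model block; the title is arbitrary, the field names follow Continue's documented model schema, and the model identifier is an assumption to verify against your account.

```json
{
  "models": [
    {
      "title": "GLM-4 via BigModel",
      "provider": "openai",
      "model": "glm-4",
      "apiBase": "https://open.bigmodel.cn/api/paas/v4/",
      "apiKey": "YOUR_BIGMODEL_API_KEY"
    }
  ]
}
```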

Ollama and automation tools

Ollama runs open-weight GLM builds locally; n8n and Zapier connect Z.ai to workflow automation pipelines.

Ollama serves open-weight models locally. GLM GGUF builds downloaded from community mirrors on Hugging Face can be loaded and served by pointing a Modelfile at the GGUF file and registering it with ollama create, as sketched below. Community shims also exist that expose the BigModel closed-API models as Ollama-compatible endpoints for clients that prefer the Ollama interface. The shim approach is less stable than the direct BigModel API integration and works best for development rather than production use.
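
A minimal sketch of the direct GGUF path; the filename and model alias are illustrative, not the names of any specific release.

```
# Modelfile: points Ollama at the downloaded community build.
# The GGUF filename below is illustrative.
FROM ./glm-4-9b-chat.Q4_K_M.gguf
```

Running ollama create glm4-local -f Modelfile registers the model, after which ollama run glm4-local serves it like any other local model.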

n8n's AI nodes include an OpenAI-compatible LLM node that accepts a custom base URL. Setting this to the BigModel endpoint and providing an API key makes GLM models available as AI action nodes in any n8n workflow. Zapier does not have a first-party Z.ai action, but the OpenAI action with a custom API configuration and the webhook action both serve as viable integration paths for teams that need GLM capability in Zapier automation sequences. Research on AI workflow orchestration from NIST is useful context for teams embedding AI models in production automation pipelines.

Z.ai integration patterns — five key tools, their integration method, and notes
| Tool | Integration pattern | Notes |
| --- | --- | --- |
| LangChain | ChatZhipuAI class in langchain-community | First-party support; drop-in replacement for ChatOpenAI in chains and agents |
| LlamaIndex | OpenAI-compatible LLM class with api_base set to the BigModel URL | All LlamaIndex abstractions work unchanged |
| Cursor / Continue.dev | Custom model endpoint in settings or config.json | Enter BigModel base URL and API key; use GLM model identifier strings |
| Ollama | Direct GGUF loading for open-weight builds; community shim for closed-API models | Shim approach works for development; direct BigModel API preferred for production |
| n8n | OpenAI-compatible LLM node with custom base URL | Full node functionality available; no dedicated Z.ai node required |

Z.ai integrations frequently asked questions

Four questions on LangChain support, Ollama shims, automation tools, and Cursor configuration.

Does LangChain support Z.ai as a model provider?

LangChain supports Z.ai through the ChatZhipuAI class in langchain-community, which wraps the BigModel API. It accepts a BigModel API key and a GLM model identifier, and exposes the same chain interface as other LangChain LLM providers. Switching from ChatOpenAI is a single class-name change.

Can I use Z.ai with Ollama?

For open-weight GLM builds downloaded as GGUF files, Ollama serves them directly once a Modelfile pointing at the file is registered with ollama create. For the closed-API GLM models on BigModel, community shims expose the BigModel endpoint as an Ollama-compatible interface. The direct BigModel API integration is preferred for production use.

Does Cursor support Z.ai models?

Cursor supports custom model endpoints through its settings. Pointing the OpenAI-compatible base URL field at the BigModel API and entering a BigModel API key routes completions and chat requests to the GLM family. The GLM code-specialised variants work well for completion workloads.

How do I connect Z.ai to n8n or Zapier?

n8n has an OpenAI-compatible LLM node that accepts a custom base URL, making BigModel API integration straightforward. Zapier does not have a native Z.ai action, but the OpenAI action with a custom configuration and the webhook action both serve as viable integration paths for automation workflows.

Z.ai integrations in the developer ecosystem

How the integrations reference connects to the API, download, and platform pages on this site.

Every integration on this page relies on either the Zhipuai API for hosted model access or the open-weight builds available through the zhipuai download path. The BigModel AI console is where API keys are issued and usage is tracked, regardless of which integration framework sits above it. For teams evaluating whether to use Z.ai through an integration versus a direct API call, the Zhipu AI pricing reference applies equally to both paths — the token billing is the same. The Zhipu AI GitHub repositories contain the inference server code that some framework integrations call into when running open-weight models locally, making the GitHub and download pages natural companions for self-hosted setups.