GLM 5.1

GLM-5.1 is Z.ai's next-generation flagship model for agentic engineering, with stronger coding capabilities and sustained performance on long-horizon tasks that run for hundreds of iteration rounds. It is a 754B-parameter Mixture-of-Experts (MoE) model.

GLM 5.1 API Features

Fine-tuning	GLM 5.1 can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model.
Serverless	Run the model immediately on pre-configured GPUs and pay per token.
On-demand Deployment	On-demand deployments give you dedicated GPUs for GLM 5.1 on Fireworks' reliable, high-performance system, with no rate limits.

Available Serverless

Run queries immediately, pay only for usage

$1.40 / $0.26 / $4.40 per 1M tokens (input / cached input / output)
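
To make the per-token pricing concrete, the following is a minimal sketch of a cost estimate built from the three rates listed above. The names `PRICE_PER_MILLION` and `estimate_cost` are hypothetical, and the sketch assumes that cached input tokens are counted separately from uncached input tokens and billed only at the cached rate; actual billing rules may differ.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the serverless pricing listed above.
PRICE_PER_MILLION = {
    "input": Decimal("1.40"),
    "cached_input": Decimal("0.26"),
    "output": Decimal("4.40"),
}


def estimate_cost(uncached_input_tokens, cached_input_tokens, output_tokens):
    """Estimate the USD cost of one request.

    Assumption (not confirmed by the listing): cached input tokens are
    billed only at the cached rate and are not also counted as input.
    """
    counts = {
        "input": uncached_input_tokens,
        "cached_input": cached_input_tokens,
        "output": output_tokens,
    }
    if any(n < 0 for n in counts.values()):
        raise ValueError("token counts must be non-negative")
    # Sum price * tokens for each category, then scale from per-1M to per-token.
    total = sum(PRICE_PER_MILLION[kind] * n for kind, n in counts.items())
    return total / Decimal(1_000_000)


# 100k uncached input, 50k cached input, 20k output tokens.
print(estimate_cost(100_000, 50_000, 20_000))
```

Under these assumptions the example request costs about $0.24, most of it from uncached input ($0.14) and output ($0.088).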

Metadata

State	Unknown
Created on	N/A
Kind	Unknown
Provider	Z.ai

Specification

Calibrated

Mixture-of-Experts

Parameters

N/A

Supported Functionality

Fine-tuning	Supported
Serverless	Supported
Context Length	202.8k tokens
Function Calling	Not supported
Embeddings	Not supported
Rerankers	Not supported
Image input	Not supported
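
The context length listed above can be turned into a simple pre-flight check before sending a request. This is a sketch under two assumptions that the listing does not confirm: that 202.8k is a rounded figure (the exact limit is not given here, so 202,800 is used as a stand-in), and that the window covers prompt tokens plus generated tokens. `CONTEXT_LIMIT_TOKENS` and `fits_context` are hypothetical names.

```python
# Assumption: "202.8k tokens" is rounded; the exact limit is not stated
# in the listing, so 202_800 is only an approximation.
CONTEXT_LIMIT_TOKENS = 202_800


def fits_context(prompt_tokens, max_output_tokens, limit=CONTEXT_LIMIT_TOKENS):
    """Return True if the prompt plus the reserved output budget fits.

    Assumption: the context window is shared by input and output tokens.
    """
    if prompt_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_output_tokens <= limit


print(fits_context(190_000, 8_000))
print(fits_context(200_000, 8_000))
```

Token counts depend on the model's tokenizer, so a check like this is only as accurate as the count passed in; leaving some headroom below the limit is a reasonable precaution.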
