GLM 5.1 FP8
Recommended starting point for production agentic workloads on Corvex — strong at code, math, and instruction-following with a 202k context.
zai-org/GLM-5.1-FP8
About
GLM 5.1 FP8 is Z.AI’s most capable open-weights generalist (~754B total, ~40B active, FP8). Strong at code, math, and instruction-following with a 202k-token context window. Recommended starting point for production agentic workloads on Corvex. Served as a single grpc engine with EAGLE speculative decoding.Pricing
| Direction | Rate (USD) |
|---|---|
| Input | $0.50 / 1M tokens |
| Output | $2.00 / 1M tokens |
Capabilities
chatfunction-callingjson-modelong-context
Specs
- Context window: 202k tokens
- License: Apache-2.0
- Released: February 14, 2026
Recommended use cases
- Chat and instruction following
- Tool use and function calling
- Structured-output generation (JSON mode)
- Long-document Q&A and summarization
Benchmarks
Benchmarks fill in during Alpha. Tracked as a follow-up to RD-562.
Example prompts
- Refactor this Go function to use generics and explain the trade-offs.
- Summarise this 80k-token contract into a one-page brief with redlines highlighted.