HeadlinesBriefing favicon HeadlinesBriefing.com

GLM-4.7-Flash 30B MoE Model Release

Hacker News: Front Page •
×

GLM Team released GLM-4.7-Flash, a 30B-parameter Mixture-of-Experts model positioning it as the strongest in its class for lightweight deployment. It targets developers needing a balance of performance and efficiency, with benchmarks showing competitive results against models like Qwen3-30B and GPT-OSS-20B.

The model supports inference frameworks vLLM and SGLang for local deployment, with detailed setup instructions provided. Benchmarks highlight strengths in SWE-bench Verified (59.2) and τ²-Bench (79.5), suggesting practical utility for coding and reasoning tasks, though it trails in some areas like LCB v6.

This release follows the earlier GLM-4.5 paper, which focused on agentic reasoning and coding. The new model is available via Z.ai's API and Hugging Face, signaling a push toward accessible, efficient open models for developers.