HeadlinesBriefing.com

Zyphra’s ZAYA1‑8B: 760 M Active Params Matching Top Models on Math

Hacker News

Zyphra unveiled ZAYA1‑8B, an 8.4 B‑parameter mixture‑of‑experts model that outperforms DeepSeek‑R1 on math, matches Claude Sonnet 4.5 on reasoning, and nears Gemini 2.5 Pro on coding, all while activating only 760 M parameters at inference. The work demonstrates that frontier performance is achievable with fewer than one billion active weights.
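To illustrate why the active count stays so far below the total, the toy PyTorch layer below routes each token through only k of n expert MLPs, so most weights sit idle on any given forward pass. All sizes and routing details here are illustrative, not ZAYA1's actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Toy mixture-of-experts layer: only k of n expert MLPs run per token."""
    def __init__(self, d_model=64, d_ff=256, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x):                       # x: (tokens, d_model)
        scores = self.router(x)                 # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)    # normalize over selected experts
        out = torch.zeros_like(x)
        for slot in range(self.k):              # run only the chosen experts
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

moe = TopKMoE()
total = sum(p.numel() for p in moe.parameters())
active = sum(p.numel() for p in moe.router.parameters()) + \
         moe.k * sum(p.numel() for p in moe.experts[0].parameters())
print(f"total params: {total:,}, active per token: ~{active:,}")
```

Scaled up, the same routing logic is how an 8.4 B‑parameter model can touch well under 1 B weights per token.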

Unusually, Zyphra trained the entire pipeline on a cluster of 1,024 AMD Instinct MI300X GPUs hosted on IBM Cloud and networked with AMD's Pensando Pollara interconnect, sidestepping the NVIDIA‑centric CUDA ecosystem that dominates the field. The run shows that the AMD stack can train models that match or exceed NVIDIA‑trained peers at scale, offering a viable alternative for labs wary of hardware lock‑in.
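Part of what makes such a switch practical is that ROCm builds of PyTorch expose AMD GPUs through the familiar CUDA device API. The minimal sketch below (not Zyphra's training code) shows the same code path running unmodified on MI300X hardware.

```python
import torch

# On a ROCm build of PyTorch, HIP is surfaced through the CUDA API, so
# code written for NVIDIA GPUs runs on AMD hardware without source changes.
print(torch.version.hip)  # None on CUDA builds; a version string on ROCm

device = "cuda" if torch.cuda.is_available() else "cpu"  # "cuda" maps to HIP on ROCm
x = torch.randn(1024, 1024, device=device)
y = x @ x                 # dispatches to rocBLAS/hipBLAS on AMD GPUs
print(y.device)
```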

ZAYA1‑8B leverages a custom attention scheme that keeps reasoning quality high even with a reduced active‑parameter budget. Coupled with Zyphra's Markovian RSA inference, which runs parallel reasoning traces that each stay within a bounded context, the model gains a significant boost on benchmarks like AIME 2026, where it scores 89.1 versus competitors' 71.6, and on 54 other tasks, cementing its niche for math‑heavy academic research.
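The article gives no implementation details for Markovian RSA, so the Python sketch below only illustrates the general pattern it describes: several independent traces, each capped at a fixed token budget, aggregated by majority vote. The function generate_trace is a hypothetical stand‑in for a real inference call.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def generate_trace(prompt: str, max_tokens: int, seed: int) -> str:
    """Hypothetical stand-in for one bounded-context reasoning trace.

    A real deployment would stream a chain of thought from the model,
    stop at the max_tokens budget, and return only the final answer.
    """
    rng = random.Random(seed)
    return rng.choice(["42", "42", "41"])  # mock final answers for the demo

def parallel_reasoning(prompt: str, n_traces: int = 8, max_tokens: int = 4096) -> str:
    # Sample independent traces concurrently; each stays inside the same
    # fixed token budget instead of growing one ever-longer context.
    with ThreadPoolExecutor(max_workers=n_traces) as pool:
        answers = list(pool.map(
            lambda seed: generate_trace(prompt, max_tokens, seed),
            range(n_traces),
        ))
    # Aggregate by majority vote over the traces' final answers.
    return Counter(answers).most_common(1)[0][0]

print(parallel_reasoning("What is 6 * 7?"))
```

Bounding each trace keeps per-request memory and latency predictable, while the vote recovers much of the accuracy a single longer chain of thought would provide.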

Despite its strengths, ZAYA1‑8B falls short on agentic benchmarks such as BFCL‑V4 and TAU2, indicating limited tool‑calling and instruction‑following ability. The model suits applications that demand rigorous math, science, or code reasoning, especially where low inference cost matters. It is available via Zyphra Cloud and on Hugging Face under an Apache 2.0 license.
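For Hugging Face users, loading should follow the standard transformers pattern sketched below. The repo id "Zyphra/ZAYA1-8B" is a guess; check Zyphra's model page for the exact name and whether trust_remote_code is required for the custom architecture.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id is illustrative, not confirmed by the article.
model_id = "Zyphra/ZAYA1-8B"

tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick the checkpoint's native precision
    device_map="auto",    # shard across available GPUs (requires accelerate)
)

prompt = "Prove that the sum of two even integers is even."
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
print(tok.decode(out[0], skip_special_tokens=True))
```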