HeadlinesBriefing.com

Meta's MTIA Chips Target AI Inference with Custom Silicon

TechPowerUp News

Meta has unveiled four new MTIA chips for high-performance AI inference. Developed in partnership with Broadcom, they are slated for deployment across Meta's data centers over the next two years. The family spans the MTIA 300, 400, 450, and 500 generations, with early units already handling ranking and recommendation workloads. Later designs target real-time model serving for Meta's massive social platforms.

Rather than chasing raw peak arithmetic, Meta prioritized memory throughput and inference efficiency across the lineup. HBM bandwidth and capacity increase substantially from one generation to the next, while compute performance grows more linearly. This approach aims to cut latency and power costs for production inference workloads. The chips also feature hardware support for attention primitives and mixture-of-experts layers, plus low-precision formats tailored for inference to minimize conversion overhead.
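
The reasoning behind favoring memory throughput over peak arithmetic can be illustrated with a simple roofline estimate. The sketch below is a minimal illustration, not a model of any MTIA part: the function name `attainable_tflops` and every number in it (peak compute, bandwidth, arithmetic intensity) are hypothetical values chosen for the example, not published specifications.

```python
def attainable_tflops(peak_tflops: float,
                      bandwidth_tbps: float,
                      flops_per_byte: float) -> float:
    """Roofline bound on sustained throughput.

    A kernel cannot run faster than the chip's peak compute, nor faster
    than memory can feed it: bandwidth (TB/s) times arithmetic intensity
    (FLOPs per byte moved) gives the memory-side ceiling in TFLOP/s.
    """
    return min(peak_tflops, bandwidth_tbps * flops_per_byte)


def main() -> None:
    # Hypothetical accelerator, NOT real MTIA figures.
    peak = 400.0       # TFLOP/s of low-precision compute (assumed)
    bandwidth = 2.0    # TB/s of HBM bandwidth (assumed)

    # Assumed low-intensity inference step: each 1-byte weight is read
    # once and used for one multiply-add, i.e. about 2 FLOPs per byte.
    intensity = 2.0

    base = attainable_tflops(peak, bandwidth, intensity)
    more_compute = attainable_tflops(2 * peak, bandwidth, intensity)
    more_bandwidth = attainable_tflops(peak, 2 * bandwidth, intensity)

    print(f"baseline:         {base:.1f} TFLOP/s")
    print(f"2x peak compute:  {more_compute:.1f} TFLOP/s")
    print(f"2x HBM bandwidth: {more_bandwidth:.1f} TFLOP/s")


if __name__ == "__main__":
    main()
```

Under these assumed numbers the workload sits far below the compute ceiling, so doubling peak arithmetic leaves throughput unchanged while doubling memory bandwidth doubles it. That is the general pattern behind prioritizing HBM bandwidth for inference; how closely it applies to a specific model depends on batch size and precision.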

The MTIA software stack natively supports common frameworks, so existing models can be deployed on both GPUs and MTIA without major rewrites. Multiple generations share the same chassis, rack, and networking, so an upgrade means swapping modules rather than refitting infrastructure. Across data centers that span millions of chips, this modularity helps explain how Meta can sustain a release cadence more aggressive than industry norms.