HeadlinesBriefing.com

OpenAI launches Jalapeño custom inference chip with Broadcom

Hacker News

OpenAI unveiled its first custom inference processor on Wednesday, a silicon design built with Broadcom and dubbed Jalapeño. The chip targets the company’s inference workload: running pre-trained models for user-facing services. OpenAI’s own models helped shape the architecture, and early testing shows markedly higher performance per watt than leading GPU alternatives. Deploying the chip in OpenAI’s data centers is meant to streamline serving large language models at scale.

OpenAI announced the Broadcom partnership in October, and rumors of a custom silicon strategy have lingered as the firm seeks to curb its reliance on Nvidia GPUs. Competitors Google and Amazon already field their own AI accelerators, so the move aligns OpenAI with a broader industry shift toward purpose-built chips that lower operating costs for real-time coding models. Jalapeño’s architecture emphasizes low-latency memory and on-chip scheduling.

OpenAI president Greg Brockman said the company’s deep workload knowledge drove the design, which aims to accelerate underserved inference tasks. While pre-training will likely remain on Nvidia hardware, even modest gains in inference efficiency can improve margins. Control over the silicon lets OpenAI fine-tune kernels and networking for each model revision. The Jalapeño launch signals OpenAI’s shift from pure model development to end-to-end infrastructure control.