HeadlinesBriefing.com

Dirac coding agent tops benchmark with 65% score and major cost cuts

Hacker News

Open-source coding agent Dirac has taken the top spot on the Terminal-Bench-2 leaderboard among entries using the gemini-3-flash-preview model, posting a 65.2% success rate. That beats Google’s official baseline (around 47%) and edges out the leading closed-source tool, Junie CLI (64.3%). Evaluations ran on public GitHub repositories without inserting any benchmark-specific files, which keeps the comparison fair. Because the project is open source, developers can audit and extend it freely.

Dirac’s architecture focuses on tight context curation to limit the drop in reasoning ability that large prompts cause. It uses hash-anchored parallel edits, abstract-syntax-tree manipulation, and a set of token-saving techniques, which together deliver an average 64.8% cost reduction compared with competing agents. On tasks such as transformer refactoring and Django updates, it reports near-perfect accuracy while trimming API spend.
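
To make the idea of hash-anchored edits concrete, here is a minimal Python sketch. It is not Dirac’s implementation, which the source does not describe in detail. It assumes one plausible reading of the term: the agent addresses each line by a short hash of its content, and an edit is applied only if that hash still matches the file. The names line_anchor, Edit, and apply_edits are hypothetical and exist only for this illustration.

```python
import hashlib
from dataclasses import dataclass


def line_anchor(text: str) -> str:
    """Short content hash used to address a line (hypothetical scheme)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:8]


@dataclass(frozen=True)
class Edit:
    line_no: int   # 0-based index into the original file
    anchor: str    # hash the agent saw when it read that line
    new_text: str  # replacement text for the line


def apply_edits(lines: list[str], edits: list[Edit]) -> tuple[list[str], list[Edit]]:
    """Apply every edit whose anchor still matches; return (new_lines, rejected)."""
    result = list(lines)
    rejected: list[Edit] = []
    touched: set[int] = set()
    for edit in edits:
        in_range = 0 <= edit.line_no < len(lines)
        # Anchors are checked against the ORIGINAL lines, so the outcome
        # does not depend on the order in which the edits arrive.
        if (
            not in_range
            or edit.line_no in touched
            or line_anchor(lines[edit.line_no]) != edit.anchor
        ):
            rejected.append(edit)
            continue
        touched.add(edit.line_no)
        result[edit.line_no] = edit.new_text
    return result, rejected


# Example: one valid edit and one edit built from a stale view of the file.
source = ["def area(r):", "    return 3.14 * r * r", "print(area(2))"]
edits = [
    Edit(1, line_anchor(source[1]), "    return math.pi * r * r"),
    Edit(2, line_anchor("print(area(3))"), "print(area(5))"),  # stale anchor
]
updated, rejected = apply_edits(source, edits)
```

In this sketch the first edit lands and the second is rejected because its anchor no longer matches the file. Under these assumptions, several edits can be issued in a single response without resending the surrounding code, which is one way such a scheme could save tokens.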

Dirac is available as a VS Code extension or a globally installed npm CLI, and it supports multiple AI providers configured through environment variables. A “plan mode” lets developers preview the agent’s strategy before execution, while “Yolo mode” automates straightforward fixes. Licensed under Apache 2.0, the project builds on the Cline codebase and is maintained by Max Trivedi at Dirac Delta Labs.