HeadlinesBriefing.com

Claude Code: More Tokens, Worse Results?

Source: DEV Community

Companies increasingly measure AI productivity through token usage, but one developer questioned whether this metric truly reflects code quality. An experiment tested Claude Code by building the same CLI Tic Tac Toe game four times with different techniques. The goal was to see whether throwing more computational resources at a problem actually improves the final product, or just inflates costs without better outcomes.

The four approaches were a raw zero-shot prompt, Plan Mode, a CLAUDE.md context file, and a combination of the context file and planning. The author tracked token consumption and used a QA agent to score the code on clarity and structure. This methodology yielded a 'quality per token' metric, directly challenging the assumption that more input equals better output.
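The 'quality per token' idea can be sketched in a few lines of Python. This is an illustrative reconstruction, not the author's actual code: the run names mirror the four approaches, but all token counts and scores below are hypothetical placeholders except the 4.9 quality score reported for the CLAUDE.md run.

```python
def quality_per_token(quality_score: float, tokens_used: int) -> float:
    """Quality delivered per 1,000 tokens consumed (higher is better)."""
    return quality_score / (tokens_used / 1_000)

# Hypothetical (score, tokens) pairs for the four approaches tested.
runs = {
    "zero-shot":        (4.2, 30_000),
    "plan-mode":        (4.6, 70_000),
    "claude-md":        (4.9, 28_000),  # 4.9 is the score reported in the article
    "claude-md + plan": (4.7, 75_000),
}

for name, (score, tokens) in runs.items():
    print(f"{name:18s} {quality_per_token(score, tokens):.3f} quality per 1k tokens")
```

Normalizing quality by tokens makes an expensive-but-marginally-better run visibly worse on efficiency, which is the comparison the experiment set out to make.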

The results were revealing. The cheapest method, using only a CLAUDE.md context file, delivered the highest quality score (4.9) and the best efficiency. Conversely, the most expensive approach, Plan Mode, required over double the tokens for slightly lower quality. This suggests that strategic context engineering beats raw token volume, a useful insight for businesses optimizing their AI spend and workflow.