HeadlinesBriefing.com

Claude Code quota drains fast on Pro Max 5x plan

Hacker News

A developer on the Pro Max 5x Claude Code plan reported that the plan's token quota was exhausted just 1.5 hours after a reset, despite only light Q&A and occasional tool calls. Earlier, the same user had spent five hours on heavy development, which used up the previous quota window as expected. The rapid post-reset depletion prompted a closer look at how tokens are counted.

Session logs from Claude Code v2.1.97 on WSL2 show that cache-read tokens were counted at the full rate against the rate limit rather than at the advertised 1/10 cost. Cache reads accounted for roughly 100 million tokens in the 1.5-hour window, even though only about 300 API calls were made. Background sessions and automatic context compaction added further hidden consumption.
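The gap between the two accounting schemes can be checked with simple arithmetic. The sketch below uses only the approximate figures reported above (100 million cache-read tokens, 300 API calls). The names are hypothetical, and it assumes the advertised 1/10 cost would apply as a flat divisor toward the rate limit, which the report does not confirm.

```python
# Approximate figures reported for the 1.5-hour window.
CACHE_READ_TOKENS = 100_000_000  # cache-read tokens
API_CALLS = 300                  # API calls

# Assumption: the "advertised 1/10 cost" acts as a flat divisor
# when cache reads are counted toward the rate limit.
DISCOUNT_DIVISOR = 10


def charged_tokens(cache_read_tokens: int, divisor: int) -> int:
    """Tokens counted against the limit for a given cache-read divisor."""
    return cache_read_tokens // divisor


full_rate = charged_tokens(CACHE_READ_TOKENS, 1)               # behavior described in the report
discounted = charged_tokens(CACHE_READ_TOKENS, DISCOUNT_DIVISOR)  # expected under the assumption
avg_per_call = CACHE_READ_TOKENS / API_CALLS                   # average cache read per API call

print(f"Counted at full rate:   {full_rate:,} tokens")
print(f"Counted at 1/10 rate:   {discounted:,} tokens")
print(f"Average cache read/call: {avg_per_call:,.0f} tokens")
```

Under these assumptions, the same activity would count ten times less against the quota at the discounted rate, and the average call reads a few hundred thousand cached tokens regardless of how small the new prompt is.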

The bug effectively nullifies the cost advantage of prompt caching and turns the 1M-token context window into a liability: each API call can consume close to a million tokens regardless of how much the user actually does. Suggested fixes include documenting how cache reads are counted, applying the reduced rate when enforcing rate limits, and preventing idle sessions from draining the shared quota. Until the issue is addressed, Pro Max 5x users cannot count on moderate usage lasting more than a couple of hours.