HeadlinesBriefing.com

Why ReAct Agents Burn 90% of Retries and How to Fix Them

Towards Data Science

ML engineers using ReAct loops in LangChain, LangGraph, or custom tool chains have found that their agents squander the majority of retry attempts on failures that can never succeed. In a controlled 200‑task benchmark, 90.8% of retries were spent on hallucinated tool calls, not on model mispredictions. The waste stems from letting the model pick tool names at runtime.
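A minimal sketch of the failure mode makes this concrete. The registry, retry limit, and tool names below are illustrative, not taken from the article's code:

```python
# Anti-pattern sketch: the model's free-text tool choice is looked up at
# runtime, and a miss is retried as if it were a transient fault.
TOOLS = {"search": lambda q: f"results for {q}"}  # hypothetical registry
MAX_RETRIES = 3

def run_step(tool_name: str, arg: str) -> str:
    for attempt in range(MAX_RETRIES):
        tool = TOOLS.get(tool_name)  # None for hallucinated names
        if tool is None:
            continue  # burns a retry that can never succeed
        return tool(arg)
    return "failed"

# "web_search" is a hallucinated name: all three retries are wasted
# before the loop gives up.
run_step("web_search", "ReAct agents")
```

Because the lookup result never changes between attempts, every retry against a hallucinated name is guaranteed dead weight.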

The common pattern of passing the LLM's output directly to TOOLS.get() returns None for unknown names, yet the global retry counter treats the miss as a transient fault. Because the error class never changes between attempts, each hallucination burns three retry slots, draining budget that could handle genuine timeouts. The article proposes three structural fixes: an error taxonomy, per‑tool circuit breakers, and deterministic routing.
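The first fix, an error taxonomy, can be sketched as follows. The exception names and helper functions here are illustrative assumptions, not the article's exact API:

```python
# Hedged sketch of an error taxonomy: unknown tool names become permanent
# errors that fail fast, while only transient faults consume retries.
class PermanentToolError(Exception):
    """Can never succeed on retry, e.g. a hallucinated tool name."""

class TransientToolError(Exception):
    """May succeed on retry, e.g. a network timeout."""

def resolve_tool(tools: dict, name: str):
    try:
        return tools[name]  # explicit KeyError instead of a silent None
    except KeyError:
        raise PermanentToolError(f"unknown tool: {name!r}") from None

def call_with_retries(tools: dict, name: str, arg, max_retries: int = 3):
    for attempt in range(max_retries):
        try:
            return resolve_tool(tools, name)(arg)
        except PermanentToolError:
            raise  # fail fast: no retry budget spent on impossible calls
        except TransientToolError:
            continue  # only genuinely transient faults are retried
    raise TransientToolError(f"{name!r} failed after {max_retries} attempts")
```

Classifying the error before retrying is what turns the misleading "retries within limits" signal into an immediate, visible failure.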

Applying those changes to the same benchmark eliminated every impossible retry, pushing success to 100% and cutting step variance threefold. Engineers can now see true failure modes in dashboards instead of a misleading “retries within limits” metric. The findings urge a shift from prompt tweaking to architectural safeguards whenever agents expose tool names dynamically.
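The remaining two fixes, deterministic routing and per‑tool circuit breakers, might look like this. The registry contents, threshold, and class names are assumptions for illustration:

```python
# Hedged sketch: validate tool names against a closed registry before any
# call, and track consecutive failures per tool rather than globally.
from dataclasses import dataclass, field

REGISTRY = {"search", "calculator"}  # hypothetical closed set of tools

def route(name: str) -> str:
    # Deterministic routing: a hallucinated name is rejected up front,
    # so it never reaches the retry loop at all.
    if name not in REGISTRY:
        raise ValueError(f"unknown tool {name!r}; valid: {sorted(REGISTRY)}")
    return name

@dataclass
class CircuitBreaker:
    threshold: int = 3
    failures: dict = field(default_factory=dict)  # per-tool failure counts

    def allow(self, tool: str) -> bool:
        # The breaker opens only for the failing tool.
        return self.failures.get(tool, 0) < self.threshold

    def record(self, tool: str, ok: bool) -> None:
        self.failures[tool] = 0 if ok else self.failures.get(tool, 0) + 1
```

Because the breaker is scoped per tool, one flaky tool cannot exhaust the agent's entire retry budget, which is what keeps step counts stable across runs.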

The author supplies a reproducible Python script and a GitHub repo so teams can audit their own agents. By moving tool resolution out of the LLM and classifying errors before retrying, cost per token drops and latency steadies, delivering more predictable production behavior without sacrificing the 89.5% success rate seen in the original ReAct setup.