HeadlinesBriefing.com

ETH Zurich Study Questions AGENTS.md Files for AI Coding

Hacker News

A new ETH Zurich paper challenges the widespread practice of adding AGENTS.md context files to repositories for AI coding agents. The researchers found that LLM-generated context files lower task success rates by 3% on average while raising inference costs by more than 20%. Human-written files produced only a marginal gain of 4% in task success rates.

To test real-world impact, the team built AGENTbench, a dataset of 138 Python tasks drawn from niche repositories. Choosing lesser-known repositories was meant to avoid the memorization bias that affects popular benchmarks like SWE-bench. The study tested four agents, including Claude 3.5 Sonnet and GPT models, in three scenarios: no context file, an LLM-generated file, and a human-written file.
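The three-way comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's evaluation harness: the function names, the condition labels, and the toy outcomes are all assumptions made for this example, and the data is placeholder input rather than results from the study. The summary does not say whether the reported changes are percentage points or relative changes, so the sketch assumes percentage points.

```python
from collections import defaultdict

# Hypothetical labels for the three scenarios: no context file,
# LLM-generated file, human-written file.
CONDITIONS = ("none", "llm_generated", "human_written")


def success_rates(runs):
    """Return {condition: fraction of tasks resolved} from (condition, resolved) pairs."""
    totals = defaultdict(int)
    solved = defaultdict(int)
    for condition, resolved in runs:
        totals[condition] += 1
        solved[condition] += int(resolved)
    return {c: solved[c] / totals[c] for c in CONDITIONS if totals[c]}


def delta_vs_baseline(rates, baseline="none"):
    """Percentage-point change in success rate relative to the baseline condition."""
    base = rates[baseline]
    return {c: round((r - base) * 100, 1) for c, r in rates.items() if c != baseline}


# Toy placeholder outcomes, NOT results from the paper.
toy_runs = [
    ("none", True), ("none", False), ("none", True), ("none", False),
    ("llm_generated", True), ("llm_generated", False),
    ("llm_generated", False), ("llm_generated", False),
    ("human_written", True), ("human_written", True),
    ("human_written", True), ("human_written", False),
]

rates = success_rates(toy_runs)
print(delta_vs_baseline(rates))
```

The same per-condition grouping could be applied to token counts or cost per run to compare inference overhead alongside success rates.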

A detailed analysis of agent traces showed that agents follow AGENTS.md instructions too literally, running unnecessary tests and file searches. The researchers concluded that context files have only marginal effects on agent behavior and are likely desirable only when written manually. The finding highlights a concrete gap between current developer recommendations and observed outcomes, and it suggests a need for principled ways to automatically generate concise, task-relevant guidance for coding agents.