HeadlinesBriefing.com

Autoresearch AI Boosts Research Performance

Hacker News

A researcher tested Andrej Karpathy's Autoresearch on an old eCLIP project over a weekend. The AI agent ran 42 experiments, committing 13 and reverting 29, and improved mean rank from 344.68 to 157.43, a 54% reduction (lower is better). The sandboxed agent worked under strict constraints: it could modify only train.py while monitoring the evaluation metrics.
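For readers unfamiliar with the metric, here is a minimal sketch of how a retrieval-style mean rank and the reported percentage reduction could be computed. It assumes the usual definition, in which each query has one correct candidate and its rank is its position when candidates are sorted by similarity. The summary does not say exactly how the project computes the metric, and the function names and the toy similarity matrix are illustrative only.

```python
from typing import List


def mean_rank(similarity: List[List[float]]) -> float:
    """similarity[i][j] scores query i against candidate j.

    Assumption: candidate i is the correct match for query i.
    """
    ranks = []
    for i, row in enumerate(similarity):
        # Rank = 1 + number of candidates scored strictly higher than the correct one.
        rank = 1 + sum(1 for s in row if s > row[i])
        ranks.append(rank)
    return sum(ranks) / len(ranks)


def percent_reduction(before: float, after: float) -> float:
    """Relative drop from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


# Toy data (hypothetical): three queries, three candidates each.
sims = [
    [0.9, 0.1, 0.2],  # correct match ranked 1st
    [0.8, 0.5, 0.6],  # correct match ranked 3rd
    [0.1, 0.7, 0.4],  # correct match ranked 2nd
]
toy = mean_rank(sims)  # (1 + 3 + 2) / 3

# The figures reported in the article.
reported = percent_reduction(344.68, 157.43)
```

With the article's numbers, `percent_reduction(344.68, 157.43)` comes out at a little over 54%, matching the stated reduction.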

The temperature clamp fix delivered the biggest single improvement (a drop of 113 in mean rank), while hyperparameter tuning accounted for most of the other gains. When the agent ventured into architectural changes and "moonshot ideas," its success rate dropped significantly. For security, the researcher containerized the training loop and restricted Claude Code's permissions.
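The summary does not say what the temperature clamp fix actually changed. As a rough illustration of what such a clamp typically looks like in CLIP-style contrastive training, the sketch below caps a learnable logit scale so that similarities are never multiplied by more than a fixed maximum. The cap of 100 and the initial temperature of 0.07 follow the original CLIP recipe and are assumptions here, not details from the experiment. All names are hypothetical.

```python
import math
from typing import List

MAX_LOGIT_SCALE = 100.0  # assumption: cap taken from the original CLIP recipe


def clamped_scale(log_scale: float, max_scale: float = MAX_LOGIT_SCALE) -> float:
    """Turn a learnable log-scale into a logit multiplier, capped at max_scale."""
    return min(math.exp(log_scale), max_scale)


def scaled_logits(cosine_sims: List[float], log_scale: float) -> List[float]:
    """Multiply cosine similarities by the clamped scale (1 / temperature)."""
    scale = clamped_scale(log_scale)
    return [scale * s for s in cosine_sims]


# A typical starting point: temperature 0.07, so the scale is 1 / 0.07, below the cap.
initial = clamped_scale(math.log(1 / 0.07))

# If training pushes the log-scale too high, exp() explodes and the cap applies.
runaway = clamped_scale(10.0)

logits = scaled_logits([0.5, -0.25], 10.0)
```

Without the cap, a runaway scale makes the softmax in the contrastive loss extremely sharp, which can destabilize training. That is one plausible reason a missing or misplaced clamp could hurt retrieval quality.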

Autoresearch proved effective for structured optimization problems where the search space is clearly defined, but it struggled with unknown unknowns. The agent's approach resembles a hyperparameter optimization algorithm with basic reasoning, automating tedious work that researchers typically dislike. The experiment demonstrated both the potential and the limitations of current AI research assistants.
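The commit-or-revert pattern described in this summary can be read as a greedy search: try a change, keep it if the metric improves, and otherwise roll back. The sketch below illustrates that reading only and is not Autoresearch's actual code. The `evaluate` callback, the config keys, and the toy scoring function are hypothetical stand-ins for a real training run.

```python
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]


def greedy_search(
    baseline: Config,
    proposals: List[Config],
    evaluate: Callable[[Config], float],
) -> Tuple[Config, float, int, int]:
    """Keep a proposed change only if it lowers the metric (lower mean rank is better)."""
    best = dict(baseline)
    best_score = evaluate(best)
    committed = reverted = 0
    for change in proposals:
        candidate = {**best, **change}  # apply the change on top of the current best
        score = evaluate(candidate)
        if score < best_score:
            best, best_score = candidate, score  # "commit"
            committed += 1
        else:
            reverted += 1  # "revert": discard the candidate
    return best, best_score, committed, reverted


def toy_mean_rank(cfg: Config) -> float:
    # Hypothetical stand-in for training and evaluating a model.
    return 100.0 + 50.0 * abs(cfg["lr"] - 0.3) + 20.0 * abs(cfg["wd"] - 0.1)


baseline = {"lr": 1.0, "wd": 0.5}
proposals = [{"lr": 0.5}, {"lr": 2.0}, {"wd": 0.1}, {"lr": 0.3}, {"wd": 0.4}]
best, score, committed, reverted = greedy_search(baseline, proposals, toy_mean_rank)
```

A loop like this can only move through changes that someone proposes and that the metric can judge. That fits the observation that the agent did well on well-defined search spaces and poorly on open-ended ideas.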