
CodeRLM: Tree-sitter-backed code indexing for LLM agents

Hacker News: Front Page

A developer has created CodeRLM, a tool that changes how LLM coding agents explore codebases: instead of the usual glob/grep/read cycle, agents query a tree-sitter-backed index. The tool implements concepts from the Recursive Language Models paper by Zhang, Kraska, and Khattab at MIT CSAIL, treating a codebase as a searchable environment rather than a flat directory of files.

CodeRLM uses a Rust server to index projects with tree-sitter, building a symbol table with cross-references that agents can query for structure, symbols, implementations, callers, and grep results. The agent workflow involves initializing a session, exploring structure, searching symbols, retrieving implementations, finding callers, and falling back to text search when needed. The server currently supports Rust, Python, TypeScript, JavaScript, and Go for symbol parsing.
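
To make the idea of a symbol table with cross-references concrete, here is a minimal sketch in Python. It is not CodeRLM's code or API. It swaps in Python's standard-library `ast` module for tree-sitter, handles only Python source, and every name in it (`SymbolIndex`, `search`, `implementation`, `find_callers`, `grep`, and the two-file sample project) is hypothetical. It mirrors the workflow described above: search symbols, retrieve an implementation, find callers, and fall back to text search.

```python
import ast
import re
from collections import defaultdict

# Hypothetical two-file project held in memory; not taken from CodeRLM.
SOURCES = {
    "util.py": "def parse(text):\n    return text.strip()\n",
    "app.py": (
        "from util import parse\n"
        "\n"
        "def load(raw):\n"
        "    return parse(raw)\n"
        "\n"
        "def main():\n"
        "    # TODO: handle empty input\n"
        "    return load(' hi ')\n"
    ),
}


class SymbolIndex:
    """Toy symbol table with caller cross-references (illustrative names only)."""

    def __init__(self, sources):
        self.sources = sources
        self.symbols = {}                # name -> (file, first line, last line)
        self.callers = defaultdict(set)  # callee name -> names of calling functions
        for path, text in sources.items():
            self._index_file(path, ast.parse(text))

    def _index_file(self, path, tree):
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                self.symbols[node.name] = (path, node.lineno, node.end_lineno)
                # Record every plain-name call made inside this function body.
                # Method calls such as text.strip() are deliberately ignored.
                for inner in ast.walk(node):
                    if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                        self.callers[inner.func.id].add(node.name)

    def search(self, fragment):
        """Return symbol names containing the given fragment."""
        return sorted(name for name in self.symbols if fragment in name)

    def implementation(self, name):
        """Return only the source lines that define the named symbol."""
        path, start, end = self.symbols[name]
        lines = self.sources[path].splitlines()
        return "\n".join(lines[start - 1:end])

    def find_callers(self, name):
        """Return the functions that call the named symbol."""
        return sorted(self.callers.get(name, ()))

    def grep(self, pattern):
        """Text-search fallback for things the symbol table does not cover."""
        rx = re.compile(pattern)
        return [
            (path, number, line)
            for path, text in sorted(self.sources.items())
            for number, line in enumerate(text.splitlines(), start=1)
            if rx.search(line)
        ]


index = SymbolIndex(SOURCES)
print(index.search("pa"))            # symbol search
print(index.implementation("load"))  # retrieve one implementation
print(index.find_callers("parse"))   # which functions call parse?
print(index.grep(r"TODO"))           # text-search fallback
```

The point of the sketch is that each query returns a small, targeted answer (a name, one function body, a list of callers) rather than whole files, which is the property an agent would rely on. A real index would also need to resolve names across scopes and languages, which this toy version does not attempt.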

Early testing favors the indexed approach over native tools. When asked to explore a codebase and identify structural improvements, the CodeRLM-enabled instance found semantic issues such as duplicated code with identical names, orphaned code, and naming convention mismatches in about 3 minutes, compared with 8 minutes using native tools. The indexed approach was better at catching semantic problems, not just file organization issues.

Installation requires some manual setup, including the Rust toolchain. The tool ships as a Claude Code plugin with hooks that guide agents toward indexed lookups instead of native file operations.