
LLM Neuroanatomy: How I Topped an AI Leaderboard Without Training

Hacker News

In mid-2024, a bizarre experiment topped the HuggingFace Open LLM Leaderboard without training a single weight. dnhkng/RYS-XLarge claimed the #1 spot across six benchmarks by duplicating seven middle layers of an existing 72-billion-parameter model. No gradient descent, no fine-tuning: just architectural rearrangement.
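
A minimal sketch of what that rearrangement looks like, assuming a Hugging Face transformers model with a Qwen2-style `model.model.layers` list; the tiny stand-in checkpoint and the exact layer indices are illustrative choices, not the author's recipe:

```python
# Sketch of layer duplication: copy a span of middle decoder blocks and splice
# the copies back in, with no training. Model name and indices are assumptions.
import copy
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-0.5B-Instruct",           # small stand-in; RYS-XLarge used a 72B base
    torch_dtype=torch.bfloat16,
)

layers = model.model.layers               # decoder blocks of a Qwen2-style model
n = len(layers)
start, end = n // 2 - 3, n // 2 + 4       # a seven-layer span around the middle

# Keep everything up to `end`, then repeat the chosen middle span once more.
duplicated = [copy.deepcopy(layer) for layer in layers[start:end]]
new_layers = list(layers[:end]) + duplicated + list(layers[end:])

model.model.layers = torch.nn.ModuleList(new_layers)
model.config.num_hidden_layers = len(new_layers)

# Recent transformers versions index the KV cache by layer, so re-number it.
for i, layer in enumerate(model.model.layers):
    if hasattr(layer.self_attn, "layer_idx"):
        layer.self_attn.layer_idx = i

print(f"{n} layers -> {len(new_layers)} layers, zero gradient steps")
```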

This discovery emerged from two strange observations. First, base64-encoded inputs worked surprisingly well as LLM prompts, suggesting early layers translate inputs into abstract representations and late layers translate back out. Second, the Goliath-120b model remained functional despite feeding the outputs of later layers into earlier ones, a fundamental violation of transformer architecture principles. These anomalies hinted that the middle layers operate in a universal internal language.
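
The base64 observation is easy to poke at yourself: encode a prompt and see whether a model handles it anyway. The OpenAI-compatible client and model name below are assumptions for illustration, not what the author used:

```python
# Toy check of the base64 observation: the model decodes and answers anyway.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
question = "What is the capital of France?"
encoded = base64.b64encode(question.encode("utf-8")).decode("ascii")

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice (assumption)
    messages=[{
        "role": "user",
        "content": f"The following is a base64-encoded question: {encoded}\n"
                   "Answer the question it contains.",
    }],
)
print(response.choices[0].message.content)
```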

The technique exploits what the author calls 'LLM Neuroanatomy'—the idea that transformers have functional regions like a brain. Early layers handle input translation, late layers handle output translation, and middle layers perform abstract reasoning in a representation-agnostic format. By duplicating these reasoning layers, the model gained more 'thinking capacity' without learning new information.
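
One rough way to probe that picture is to compare hidden states between adjacent layers: if the middle layers change the representation only gradually while the first and last few layers change it sharply, that is at least consistent with distinct translation and reasoning regions. The model choice and this interpretation are assumptions, not the author's method:

```python
# Probe: cosine similarity of the last token's hidden state across adjacent layers.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "Qwen/Qwen2-0.5B-Instruct"          # small stand-in model (assumption)
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.float32)
model.eval()

inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

hidden = outputs.hidden_states             # embeddings plus one tensor per layer
for i in range(1, len(hidden)):
    prev, curr = hidden[i - 1][0, -1], hidden[i][0, -1]
    sim = F.cosine_similarity(prev, curr, dim=0).item()
    print(f"layer {i:2d}: similarity to previous layer = {sim:.3f}")
```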

This architectural hack challenges assumptions about how transformers work internally. If middle layers truly operate in a universal abstract space, it suggests intelligence in large language models emerges from structural arrangement rather than learned weights alone.