HeadlinesBriefing.com

Building a Tiny Computer Inside a Transformer

Towards Data Science

A developer spent their Easter weekend building a tiny computer inside a transformer model. Rather than training a model to discover an algorithm through optimization, they compiled a program directly into the transformer's weights, making it execute a deterministic computation graph without any training. The approach treats the residual stream as working memory and each layer as a machine step.
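The framing above can be sketched in a few lines. This is an assumed illustration, not the author's code: the residual stream is a small vector used as working memory, and one "layer" computes a delta from the current state and writes it back via residual addition.

```python
import numpy as np

# Hypothetical sketch of the framing (not the article's code):
# the residual stream is working memory, and each layer is one
# machine step that writes its result back by residual addition.

def layer_step(stream, W):
    """One machine step: compute a delta from the current state,
    then add it back into the residual stream (the 'write')."""
    return stream + W @ stream

# Weights are set by hand — compiled, not trained.
# Stream layout: [x, acc]. This step copies x into acc.
W_copy = np.array([[0.0, 0.0],
                   [1.0, 0.0]])

stream = np.array([42.0, 0.0])
stream = layer_step(stream, W_copy)
print(stream)   # -> [42. 42.]
```

Because the weights are fixed by hand rather than found by gradient descent, the layer's behavior is fully deterministic and inspectable.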

The author demonstrated this with a trivial program: given input B, look up the value (5), add 1, and output 6. Attention heads perform lookups by comparing queries against keys and returning associated values, while feed-forward units handle local computations. Hidden dimensions become registers, and residual addition writes results back to state after each step.
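A minimal sketch of that two-step program, under assumed details (the key/value table, register layout, and score scaling are illustrative, not taken from the article): an attention-style lookup retrieves 5 for input B, and a feed-forward step adds 1, with dedicated dimensions of the stream serving as registers.

```python
import numpy as np

# Illustrative sketch, not the author's implementation.
# Key/value memory for the attention-style lookup: key "B" -> value 5.
keys   = {"A": np.array([1.0, 0.0, 0.0]),
          "B": np.array([0.0, 1.0, 0.0]),
          "C": np.array([0.0, 0.0, 1.0])}
values = {"A": 3.0, "B": 5.0, "C": 7.0}
K = np.stack([keys[t] for t in "ABC"])    # (3, 3) key matrix
V = np.array([values[t] for t in "ABC"])  # (3,)  value vector

def attention_lookup(query):
    """Step 1: compare the query against all keys; the softmax is
    sharpened so it acts as a near-exact table lookup."""
    scores = (K @ query) * 50.0           # large scale -> hard lookup
    attn = np.exp(scores - scores.max())
    attn /= attn.sum()
    return float(attn @ V)

def ffn_add_one(x):
    """Step 2: a feed-forward unit doing the local computation x + 1."""
    return x + 1.0

# Residual stream: [input one-hot | r0 | r1] — dims act as registers.
stream = np.zeros(5)
stream[:3] = keys["B"]                    # program input: token B

# Layer 1 (attention): look up B -> 5, write into register r0.
stream[3] = attention_lookup(stream[:3])
# Layer 2 (feed-forward): read r0, add 1, write into register r1.
stream[4] = ffn_add_one(stream[3])

print(stream[4])                          # -> 6.0
```

The sharpened softmax is what makes the attention head behave like a deterministic table lookup rather than a soft average over all values.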

This offers an alternative to external tool use - instead of forcing models to leave their execution loop for exact computation, they could have an internal deterministic mode. It's narrower than Percepta's recent work, which compiles an interpreter into the weights. This approach compiles the target program itself, making it less general but simpler and more transparent for understanding how deterministic computation can live inside transformer blocks.