HeadlinesBriefing.com

80386 Early Start Unlocks FPGA Performance Gains

Hacker News

Intel's 80386 processor introduced Early Start to hide memory latency, a trick that improved performance by about 9% but also produced the infamous POPAD bug. Early Start lets the 386 overlap the address calculation for the next instruction with the final cycle of the current one, a detail buried in its microcode. The z386 FPGA core initially lacked Early Start; it now reproduces the behavior in hardware logic and reaches ao486-class performance without changing the board's 85 MHz clock. The gains are measurable: the core's Doom frame rate rose 39% (16.6 to 23.0 FPS), passing ao486, and 16-bit 3DBench scores now edge past ao486 as well.
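The reported Doom figures can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function and variable names are made up for this example, and the CPI estimate assumes the frame rate is limited entirely by the CPU at a fixed clock, which the source does not state.

```python
def percent_gain(before: float, after: float) -> float:
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


# Figures quoted in the article.
doom_before_fps = 16.6
doom_after_fps = 23.0

gain = percent_gain(doom_before_fps, doom_after_fps)
print(f"Doom gain: {gain:.1f}%")  # rounds to the 39% quoted above

# Assumption (not from the source): at a fixed clock, with frame time
# dominated by the CPU, frame rate scales as 1/CPI. The CPI ratio is then
# the inverse of the frame-rate ratio.
cpi_ratio = doom_before_fps / doom_after_fps
print(f"Implied CPI ratio (after / before): {cpi_ratio:.2f}")
```

Under that assumption, the same 85 MHz clock delivers the higher frame rate only because each instruction costs fewer cycles on average.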

The z386's gains come from two key optimizations. First, Early Start computes the next instruction's effective address during the final cycle of the previous instruction, with a forwarding network resolving data hazards. This mirrors what the 386's microcode does, but it requires precise cycle timing. Second, the memory pipeline was tightened: a 3-entry store queue now lets stalls clear earlier, and cache accesses fold the TLB lookup into the first cycle of the instruction. Together these changes reduce cycles per instruction (CPI), so the core does more work per clock at the same frequency. The design also adds an early branch redirect, giving up 386-style cycle accuracy for speed: a pragmatic choice of functional correctness over cycle-exact emulation.
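The effect of Early Start on CPI can be illustrated with a toy cycle counter. This is a minimal sketch, not the z386 RTL: the `Instr` type, the one-cycle address cost, and the per-instruction cycle counts are all assumptions chosen for illustration, and the model assumes forwarding always resolves hazards.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    exec_cycles: int           # cycles spent executing, excluding address calculation
    uses_memory: bool = False  # True if an effective address must be computed


ADDR_CYCLES = 1  # assumed cost of one effective-address calculation


def total_cycles(program: list[Instr], early_start: bool) -> int:
    """Count cycles for a straight-line program in this toy model."""
    cycles = 0
    for i, ins in enumerate(program):
        if ins.uses_memory:
            # With early start, the address is computed during the final
            # cycle of the previous instruction, so it adds no cycle here.
            # The first instruction has no predecessor to overlap with.
            overlapped = early_start and i > 0
            if not overlapped:
                cycles += ADDR_CYCLES
        cycles += ins.exec_cycles
    return cycles


# Hypothetical instruction mix; the cycle counts are not real 386 timings.
PROGRAM = [
    Instr("mov eax, [ebx]", 2, uses_memory=True),
    Instr("add eax, ecx", 2),
    Instr("mov [edi], eax", 2, uses_memory=True),
    Instr("inc ecx", 2),
    Instr("mov edx, [esi+4]", 2, uses_memory=True),
]

for mode in (False, True):
    n = total_cycles(PROGRAM, early_start=mode)
    print(f"early_start={mode}: {n} cycles, CPI={n / len(PROGRAM):.2f}")
```

In this model the saving is one cycle per memory-operand instruction that has a predecessor, so the benefit depends on how much of the instruction mix touches memory.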

The POPAD bug shows the risk in Early Start. POPAD restores the general registers from the stack and writes EAX last; when it is immediately followed by a memory access that uses EAX in its address, the forwarding network may propagate an old value into the early address calculation. z386 reproduces this flaw, matching 386 behavior rather than fixing it. Architecturally, Early Start is coarse pipelining: it overlaps the boundary between whole instructions rather than splitting each instruction into fine-grained stages. Intel accepted that added complexity for a gain of about 9%. For retro computing, z386 shows that Early Start's principles carry over to modern FPGAs, historical baggage included.
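The stale-value hazard can be sketched in a few lines. This is a deliberately simplified illustration, not the actual 386 erratum mechanism or z386 logic: the register values, the `run` function, and the idea that the address is sampled just before the final register write lands are all assumptions made for the example.

```python
def run(early_start: bool) -> int:
    """Return the address used by a load that follows a POPAD-like restore."""
    regs = {"eax": 0x1000, "ebx": 0x20}  # register values before the restore
    restored = {"eax": 0x5000}           # value the restore writes to EAX, last

    if early_start:
        # The address for the next instruction is computed during the
        # restore's final cycle, before the EAX write has landed, and no
        # forwarding path supplies the new value in this model.
        address = regs["ebx"] + regs["eax"]  # uses stale EAX
        regs.update(restored)
    else:
        # Without overlap, the restore completes before the address is formed.
        regs.update(restored)
        address = regs["ebx"] + regs["eax"]  # uses restored EAX
    return address


print(hex(run(early_start=False)))  # 0x5020, the intended address
print(hex(run(early_start=True)))   # 0x1020, built from the old EAX
```

The two results differ only because of when EAX is read relative to its final write, which is why a hazard like this depends on exact instruction sequence and timing.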

z386's results matter for enthusiasts and engineers alike. By reviving a 1980s technique, it shows how a microarchitectural optimization can close a performance gap without raising the clock. The 39% FPS gain in Doom is more than nostalgia: it shows that a carefully tuned 386 core can match the ao486 core on these benchmarks. The POPAD bug is the caution: Early Start's efficiency comes with subtle, hard-to-reproduce flaws. For those building retro systems, z386 offers a blueprint for balancing performance and accuracy, though modern CPUs would still outpace it. The core lesson? Sometimes, doubling down on old ideas with new tools yields surprising results.