HeadlinesBriefing.com

LLVM RISC-V regression fixed after 24% slowdown

Hacker News

A recent LLVM commit unintentionally reintroduced a performance gap with GCC on RISC‑V for a single benchmark. The change folded a double‑precision extension (fpext) into a direct unsigned‑integer‑to‑floating‑point cast (uitofp), which broke a downstream narrowing step in visitFPTrunc. As a result, the generated code used fdiv.d (33‑cycle latency) instead of fdiv.s (19‑cycle latency), causing roughly 24% slower execution on SiFive P550 silicon.

Benchmarking on the SiFive P550 showed LLVM consuming about eight percent more cycles than GCC for the same test, despite similar surrounding assembly. The author used llvm‑mca to isolate the loop, confirming that only the double‑precision division differed. Comparing against an older LLVM build revealed the prior version emitted fdiv.s, matching GCC’s performance. This gap highlighted the sensitivity of floating‑point lowering passes to target‑specific latency profiles.

The submitted patch restores the narrowing optimization by extending getMinimumFPType with range analysis, allowing the compiler to collapse fptrunc(uitofp x double) to float into a direct uitofp x to float cast. This adjustment eliminates the regression, brings LLVM back in line with GCC on the SiFive target, and likely improves power efficiency as well. It also demonstrates how subtle IR transformations can ripple into measurable latency on modern out‑of‑order CPUs.
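To illustrate why a range check matters for this kind of narrowing, here is a minimal Python sketch. It is not LLVM's implementation: the helper names (`fptrunc_to_f32`, `uitofp_f32`, `via_double`, `fits_in_float_exactly`) are hypothetical, and the 24-bit threshold is simply the IEEE-754 binary32 significand width, assumed here as the condition under which an unsigned integer is exactly representable as a float. The sketch compares converting through double and then truncating against converting directly to float.

```python
import struct

FLOAT_SIGNIFICAND_BITS = 24  # IEEE-754 binary32, including the implicit leading bit


def fptrunc_to_f32(d: float) -> float:
    """Round a binary64 value to binary32 (like 'fptrunc double to float')."""
    return struct.unpack("<f", struct.pack("<f", d))[0]


def uitofp_f32(x: int) -> float:
    """Convert an unsigned integer directly to binary32, round-to-nearest-even.

    Valid for inputs up to 64 bits, which is all this sketch needs.
    """
    shift = max(x.bit_length() - FLOAT_SIGNIFICAND_BITS, 0)
    if shift == 0:
        return float(x)  # fits in 24 bits, so the conversion is exact
    q, r = divmod(x, 1 << shift)
    half = 1 << (shift - 1)
    if r > half or (r == half and (q & 1)):
        q += 1  # round up (ties go to the even significand)
    return float(q << shift)


def via_double(x: int) -> float:
    """Model 'fptrunc(uitofp x to double) to float'."""
    return fptrunc_to_f32(float(x))


def fits_in_float_exactly(max_value: int) -> bool:
    """Hypothetical range check: every value in [0, max_value] is exact in binary32."""
    return max_value.bit_length() <= FLOAT_SIGNIFICAND_BITS


# A value known to fit in 24 bits converts identically either way.
small = (1 << 24) - 1
# A wide 64-bit value can round twice on the double path and end up different.
wide = (1 << 60) + (1 << 36) + 1
```

With these definitions, `via_double(small) == uitofp_f32(small)` holds, while `via_double(wide)` and `uitofp_f32(wide)` differ because the double path rounds twice. Knowing the integer's range is what lets a compiler treat the value as safely representable in the narrower type.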