HeadlinesBriefing.com

Ultrafast FPGA inference with Kolmogorov‑Arnold Networks

Hacker News

A master’s thesis from Duc Hoang, Aarush Gupta and Philip C. Harris introduces hardware architectures that deliver ultrafast inference and online learning on FPGAs using Kolmogorov‑Arnold Networks. The work earned the FPGA 2026 Best Paper award and appears in an ICML preprint, showcasing a custom LUT‑based evaluator called KANELÉ that targets sub‑microsecond latency.

GPUs excel at parallel workloads, but their scheduling and memory overhead rules out nanosecond‑scale response. FPGAs avoid this overhead by configuring lookup tables (LUTs) and flip‑flops directly into the required digital logic, with real values represented in fixed point as short binary words. Training lookup‑table neural networks directly on these primitives lets designers reduce approximation error while keeping resource use minimal.
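To make the fixed‑point idea concrete, here is a minimal Python sketch of signed fixed‑point quantization. The 8‑bit word with 4 fractional bits and the helper names `to_fixed`, `to_bits`, and `from_fixed` are assumptions chosen for illustration; they are not taken from the thesis.

```python
# Illustrative sketch only: the bit widths and function names are assumptions.

def to_fixed(x: float, total_bits: int = 8, frac_bits: int = 4) -> int:
    """Quantize x to a signed fixed-point integer code, saturating at the limits."""
    lo = -(1 << (total_bits - 1))        # most negative representable code
    hi = (1 << (total_bits - 1)) - 1     # most positive representable code
    code = round(x * (1 << frac_bits))   # scale, then round to the nearest integer
    return max(lo, min(hi, code))        # clamp instead of wrapping around


def to_bits(code: int, total_bits: int = 8) -> str:
    """Two's-complement bit string of a fixed-point code."""
    return format(code & ((1 << total_bits) - 1), f"0{total_bits}b")


def from_fixed(code: int, frac_bits: int = 4) -> float:
    """Real value that a fixed-point code stands for."""
    return code / (1 << frac_bits)


print(to_bits(to_fixed(1.5)))      # 00011000
print(to_bits(to_fixed(-0.25)))    # 11111100
print(from_fixed(to_fixed(0.3)))   # 0.3125 (quantization error of 0.0125)
```

With 4 fractional bits the step size is 1/16, so any value inside the representable range is off by at most 1/32 after rounding; values outside the range saturate.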

KANs replace scalar weights with learnable univariate functions expressed as linear combinations of B‑spline bases. Each edge applies its own spline activation to its input, and each node sums the outputs of its incoming edges. Because every edge function takes a single quantized input, it can be tabulated ahead of time, so these models map naturally onto LUT structures. The thesis reports inference latencies measured in nanoseconds with accuracy comparable to conventional MLPs, suggesting that FPGAs can host practical deep‑learning workloads.
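The following Python sketch illustrates the general idea of a KAN edge and its tabulation into a lookup table. It is not the KANELÉ architecture: the uniform knot grid, cubic degree, bit widths, and the names `bspline_basis`, `edge_fn`, `build_lut`, and `node_output` are all assumptions made for illustration.

```python
# Illustrative sketch only: constants and names are assumptions, not the thesis design.

DEGREE = 3                                  # cubic B-splines
KNOTS = [float(k) for k in range(-5, 6)]    # uniform knots; splines cover [-2, 2]
N_BASIS = len(KNOTS) - DEGREE - 1           # 7 basis functions
IN_BITS, IN_FRAC = 6, 4                     # signed input in [-2, 2), step 1/16
OUT_FRAC = 6                                # output step 1/64


def bspline_basis(i: int, k: int, x: float) -> float:
    """Cox-de Boor recursion: i-th B-spline basis of degree k evaluated at x."""
    if k == 0:
        return 1.0 if KNOTS[i] <= x < KNOTS[i + 1] else 0.0
    left = right = 0.0
    if KNOTS[i + k] != KNOTS[i]:
        left = (x - KNOTS[i]) / (KNOTS[i + k] - KNOTS[i]) * bspline_basis(i, k - 1, x)
    if KNOTS[i + k + 1] != KNOTS[i + 1]:
        right = ((KNOTS[i + k + 1] - x) / (KNOTS[i + k + 1] - KNOTS[i + 1])
                 * bspline_basis(i + 1, k - 1, x))
    return left + right


def edge_fn(x: float, coeffs: list) -> float:
    """Learnable edge activation: a linear combination of B-spline bases."""
    return sum(c * bspline_basis(i, DEGREE, x) for i, c in enumerate(coeffs))


def build_lut(coeffs: list) -> list:
    """Tabulate edge_fn for every possible input word as fixed-point outputs."""
    lut = []
    for addr in range(1 << IN_BITS):
        # Interpret the LUT address as a two's-complement input code.
        code = addr - (1 << IN_BITS) if addr >= (1 << (IN_BITS - 1)) else addr
        x = code / (1 << IN_FRAC)
        lut.append(round(edge_fn(x, coeffs) * (1 << OUT_FRAC)))
    return lut


def node_output(luts: list, input_codes: list) -> int:
    """A node sums one table lookup per incoming edge (integer arithmetic only)."""
    mask = (1 << IN_BITS) - 1
    return sum(lut[code & mask] for lut, code in zip(luts, input_codes))


# Two edges with hypothetical "trained" coefficients feeding one node.
lut_a = build_lut([0.0, -0.5, 0.25, 1.0, 0.25, -0.5, 0.0])
lut_b = build_lut([1.0] * N_BASIS)          # all-ones coefficients give a constant 1
print(node_output([lut_a, lut_b], [0, -8])) # inputs x = 0.0 and x = -0.5
```

Once the tables are built, inference needs no multiplications or spline evaluation at all: each edge is a single lookup and each node is an integer sum, which is the kind of operation FPGA LUTs and adders implement directly. In this sketch each 6‑bit input yields a 64‑entry table; wider inputs grow the table exponentially, which is one reason low‑precision quantization matters.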

The results open a path for latency‑critical applications, such as high‑frequency trading or autonomous sensor arrays, where GPUs cannot meet the latency constraints.