HeadlinesBriefing.com

RCLI: On-Device Voice AI Beats Cloud APIs on Apple Silicon

Hacker News

RunAnywhere AI has open-sourced RCLI, an on-device voice AI pipeline that outperforms cloud APIs on Apple Silicon. The tool combines speech-to-text, LLM inference, and text-to-speech into a single macOS workflow that runs entirely locally with sub-200ms latency.

Built on MetalRT, a custom GPU inference engine, RCLI posts striking numbers: 658 tokens/second on Qwen3-0.6B versus 295 tokens/second for llama.cpp, and 101ms to transcribe 70 seconds of audio versus 465ms for mlx-whisper. The engine uses custom Metal shaders and pre-allocated memory to eliminate framework overhead.

RCLI demonstrates that local AI can match cloud performance for voice applications. The tool includes 20+ models, local RAG over documents, and a terminal dashboard for model management. Available via Homebrew or direct download, it requires macOS 13+ on Apple Silicon M3 or later, with M1/M2 support coming soon.
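For readers who want to try it, setup might look like the sketch below. The exact Homebrew formula name is an assumption (the article only says "available via Homebrew"); check the RunAnywhere AI repository for the actual instructions.

```shell
# Confirm the stated requirements: macOS 13+ on Apple Silicon.
sw_vers -productVersion   # should report 13.0 or later
uname -m                  # should report arm64

# Hypothetical install command -- the formula name "rcli" is assumed,
# not confirmed by the article.
brew install rcli
```

Note that per the article, only M3-or-later chips are supported today, so an `arm64` result alone does not guarantee compatibility.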