HeadlinesBriefing.com

yapsnap lets you transcribe videos locally with a single command

Hacker News

Developers can now turn any YouTube video, TikTok or Instagram Reel into a plain‑text transcript using the new yapsnap CLI. The tool fetches the video with yt-dlp, decodes it via ffmpeg, then runs an 80 MB Kroko English Zipformer model entirely on the CPU. The model is downloaded once, on first use; later runs reuse it, so transcription itself stays offline.
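The fetch and decode stages described above can be sketched in Python as two external commands. This is an illustration only, not yapsnap's actual code: the function names, the file names, and the 16 kHz mono WAV target (a common input format for speech models) are assumptions. The yt-dlp and ffmpeg flags shown are standard options of those tools.

```python
import shutil
import subprocess
from pathlib import Path


def download_cmd(url: str, workdir: Path) -> list[str]:
    """Build a yt-dlp command that saves the best available audio stream."""
    return [
        "yt-dlp",
        "--no-playlist",                        # fetch a single video only
        "-f", "bestaudio/best",                 # prefer audio-only formats
        "-o", str(workdir / "source.%(ext)s"),  # hypothetical output name
        url,
    ]


def decode_cmd(src: Path, wav: Path, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that decodes to mono 16-bit PCM WAV."""
    return [
        "ffmpeg", "-y",            # overwrite the output if it exists
        "-i", str(src),
        "-vn",                     # drop any video stream
        "-ac", "1",                # mono
        "-ar", str(sample_rate),   # assumed sample rate, not confirmed by the source
        "-c:a", "pcm_s16le",
        str(wav),
    ]


def fetch_and_decode(url: str, workdir: Path) -> Path:
    """Run both stages and return the path of the decoded WAV file."""
    for tool in ("yt-dlp", "ffmpeg"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} is not installed or not on PATH")
    workdir.mkdir(parents=True, exist_ok=True)
    subprocess.run(download_cmd(url, workdir), check=True)
    src = next(workdir.glob("source.*"))  # extension depends on the site
    wav = workdir / "audio.wav"
    subprocess.run(decode_cmd(src, wav), check=True)
    return wav
```

Building each command as a list and passing it to subprocess.run without a shell avoids quoting problems with URLs that contain characters such as & or ?.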

Setup is minimal because the tool has only three Python dependencies (sherpa-onnx, numpy and yt-dlp): install ffmpeg, pip install the repository, and run a single command. Users can request sentence‑level timestamps, adjust preprocessing speed, or keep the extracted audio. The transcript is written to a UTF‑8 .txt file alongside the source or in a custom folder.

The approach sidesteps GPU requirements and cloud APIs, so there are no API keys, no quotas, and no data leaving the machine. By streaming audio to the transducer, yapsnap finishes faster than the source media plays, even at 1.5× speed. It offers a privacy‑first, low‑resource alternative for developers who need quick, local speech‑to‑text.
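One way to picture the streaming claim is as a loop that hands fixed-size chunks of decoded audio to a recognizer and then compares processing time with audio duration. The sketch below uses only the Python standard library. The `feed` callback is a hypothetical stand-in for whatever consumes the audio; it is not yapsnap's or sherpa-onnx's API, and the 0.5 second chunk size is an arbitrary assumption. The returned ratio is the usual real-time factor: a value below 1 means processing finished faster than the audio would take to play.

```python
import time
import wave
from typing import Callable


def stream_wav(wav_path: str,
               feed: Callable[[bytes], None],
               chunk_seconds: float = 0.5) -> float:
    """Feed a WAV file to `feed` in chunks and return the real-time factor."""
    start = time.perf_counter()
    with wave.open(wav_path, "rb") as wf:
        rate = wf.getframerate()
        total_frames = wf.getnframes()
        frames_per_chunk = max(1, int(rate * chunk_seconds))
        while True:
            data = wf.readframes(frames_per_chunk)
            if not data:  # end of file
                break
            feed(data)    # hypothetical consumer, e.g. a recognizer wrapper
    elapsed = time.perf_counter() - start
    audio_seconds = total_frames / rate
    if audio_seconds == 0:
        raise ValueError("WAV file contains no audio")
    return elapsed / audio_seconds  # below 1.0 means faster than playback
```

For example, `stream_wav("audio.wav", chunks.append)` with `chunks = []` collects the raw chunks, which is a quick way to check the chunking before a real model is attached.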