Parakeet.cpp – Parakeet ASR inference in pure C++ with Metal GPU acceleration

47 points • by noahkay13 • today at 3:48 AM • 6 comments

Comments

I built a C++ inference engine for NVIDIA's Parakeet speech recognition models using Axiom (https://github.com/Frikallo/axiom), my tensor library.

What it does:
- Runs 7 model families: offline transcription (CTC, RNNT, TDT, TDT-CTC), streaming (EOU, Nemotron), and speaker diarization (Sortformer)
- Word-level timestamps
- Streaming transcription from microphone input
- Speaker diarization detecting up to 4 speakers

antirez • today at 8:20 AM

https://github.com/antirez/qwen-asr

https://github.com/antirez/voxtral.c

Qwen-asr can easily transcribe live radio (see README) on any random laptop. It looks like we are going to see really cool things in local inference, now that automatic programming makes it a lot simpler to create solid pipelines for new models in C, C++, Rust, ..., in a matter of hours.

ghostpepper • today at 4:38 AM

Off topic but if anyone is looking for a nice web-GUI frontend for a locally-hosted transcription engine, Scriberr is nice

https://github.com/rishikanthc/Scriberr

nullandvoid • today at 7:36 AM

I've been using Handy with Parakeet on both Windows and Mac, and have been very impressed.

How does this compare?
