Voxtral Realtime 4B Pure C Implementation

github.com


Posted by cm0002 to AI - Artificial intelligence@programming.dev · English · 4 months ago

GitHub - antirez/voxtral.c: Pure C inference of Mistral Voxtral Realtime 4B speech to text model


Speech to text model inference in pure C.

This is a C implementation of the inference pipeline for Mistral AI’s Voxtral Realtime 4B model. It has zero external dependencies beyond the C standard library. MPS inference is decently fast, while BLAS acceleration is usable but slow, because it continuously converts the bf16 weights to fp32.
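To make the bf16-to-fp32 cost concrete, here is a minimal sketch of what such a conversion looks like in plain C. It is not taken from voxtral.c: the function names bf16_to_f32 and bf16_row_to_f32 are hypothetical, and the only fact it relies on is that a bf16 value is the upper 16 bits of an IEEE-754 single-precision float. Doing this for every weight on every matrix multiply is the kind of overhead the post describes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not the project's code.
 * A bf16 value holds the top 16 bits of a binary32 float, so widening
 * is a 16-bit left shift followed by a bit-for-bit reinterpretation. */
static float bf16_to_f32(uint16_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f); /* avoids strict-aliasing problems */
    return f;
}

/* Widen n bf16 weights into a caller-provided fp32 buffer, e.g. one
 * matrix row before handing it to a BLAS routine that expects fp32. */
static void bf16_row_to_f32(const uint16_t *src, float *dst, size_t n) {
    for (size_t i = 0; i < n; i++) {
        dst[i] = bf16_to_f32(src[i]);
    }
}
```

The conversion itself is cheap per element, but it touches every weight and doubles the memory traffic, which is presumably why a backend that works on bf16 directly would avoid it.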

Audio processing uses a chunked encoder with overlapping windows, bounding memory usage regardless of input length. Audio can also be piped from stdin (--stdin), making it easy to transcode and transcribe any format via ffmpeg, or captured live from the microphone (--from-mic, macOS). A streaming C API (vox_stream_t) lets you feed audio incrementally and receive token strings as they become available.
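The idea of overlapping windows with bounded memory can be sketched as follows. This is an illustration only, under stated assumptions: chunk_t and plan_chunks are hypothetical names, the window and overlap sizes are arbitrary parameters, and the project's real encoder chunking may differ. The point is that only one window of samples needs to be resident at a time, however long the input is.

```c
#include <stddef.h>

/* Hypothetical description of one window: [start, start + len). */
typedef struct {
    size_t start;
    size_t len;
} chunk_t;

/* Split `total` samples into windows of at most `window` samples, where
 * consecutive windows share `overlap` samples. Writes up to max_chunks
 * entries to `out` and returns how many were written. Returns 0 for
 * invalid parameters (window == 0 or overlap >= window). */
static size_t plan_chunks(size_t total, size_t window, size_t overlap,
                          chunk_t *out, size_t max_chunks) {
    if (window == 0 || overlap >= window) {
        return 0;
    }
    size_t hop = window - overlap; /* advance per window */
    size_t n = 0;
    for (size_t start = 0; start < total && n < max_chunks; start += hop) {
        size_t remaining = total - start;
        size_t len = remaining < window ? remaining : window;
        out[n].start = start;
        out[n].len = len;
        n++;
        if (start + len >= total) {
            break; /* this window reached the end of the input */
        }
    }
    return n;
}
```

For example, 10 samples with a window of 4 and an overlap of 1 give three windows starting at 0, 3, and 6. A streaming consumer would process each window as it arrives and keep only the overlapping tail for the next one.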

Similar projects: Whisper.cpp
