Gemini Executive Synthesis

Speakrs, a Rust/ONNX implementation of the pyannote diarization pipeline.

Technical Positioning
A significantly faster, Python-runtime-free alternative to pyannote, offering 20-37x speed improvements on macOS through optimized hardware utilization (CPU, Neural Engine, GPU), with batch and fast modes.
SaaS Insight & Market Implications
This project directly addresses critical performance bottlenecks in speech processing pipelines, a key concern for B2B applications in call center analytics, meeting transcription, and voice AI. The 20-37x speed improvement on macOS, achieved through native CoreML models and by spreading work across CPU, Neural Engine, and GPU, provides a substantial competitive advantage for developers targeting Apple's ecosystem. While Linux/CUDA gains are more modest (2-3x, depending on the CPU), the elimination of the Python runtime simplifies deployment and reduces dependency overhead. This performance uplift translates into lower operational costs and faster processing times for enterprises. The fast mode, which trades some accuracy for additional speed, provides flexibility for diverse use cases, positioning Speakrs as a high-performance foundation for advanced audio analytics solutions.
Proprietary Technical Taxonomy
diarization pipeline, Rust, ONNX Runtime, CoreML, Python runtime, segmentation, powerset decode, overlap-add aggregation

Raw Developer Origin & Technical Request

Hacker News, May 27, 2026
Show HN: Speakrs, full pyannote pipeline in Rust/ONNX, 20-37x faster on macOS

Speakrs implements the full pyannote community-1 style diarization pipeline in Rust: segmentation, powerset decode, overlap-add aggregation, binarization, embedding, PLDA, and VBx clustering.
There is no Python runtime in the library path. Inference runs on ONNX Runtime or native CoreML, and the rest of the pipeline stays in Rust.
It is 20x-30x faster on macOS, but only 2-3x faster on Linux/CUDA (depending on CPU).
A few reasons it's faster:
1. Speakrs uses CoreML versions of the models. I exported the models specifically to run on CoreML. pyannote just runs the same PyTorch versions through MPS (Metal) on macOS.
2. pyannote is not a single model; it's a few different models put together in a pipeline. The README has some info on the full pipeline.
3. Speakrs optimizes the pipeline so different parts can run on CPU, Neural Engine, and GPU.
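The early stages named in the post (powerset decode, overlap-add aggregation, binarization) can be illustrated with a minimal Rust sketch. This is not Speakrs code: the function names, the class ordering (speaker subsets sorted by size, then lexicographically), the plain averaging of overlapping windows, and the single-speaker simplification in `overlap_add` are all assumptions made for illustration.

```rust
/// Push every size-`k` subset of speakers `start..n` onto `out` (lexicographic order).
fn combinations(n: usize, k: usize, start: usize, cur: &mut Vec<usize>, out: &mut Vec<Vec<usize>>) {
    if cur.len() == k {
        out.push(cur.clone());
        return;
    }
    for i in start..n {
        cur.push(i);
        combinations(n, k, i + 1, cur, out);
        cur.pop();
    }
}

/// Assumed class layout: all speaker subsets of size 0..=max_set_size,
/// ordered by size and then lexicographically. Class 0 is "nobody speaking".
fn powerset_classes(num_speakers: usize, max_set_size: usize) -> Vec<Vec<usize>> {
    let mut classes = Vec::new();
    for size in 0..=max_set_size {
        combinations(num_speakers, size, 0, &mut Vec::new(), &mut classes);
    }
    classes
}

/// Powerset decode: take the argmax class per frame (first index wins ties)
/// and expand it into one boolean activity flag per speaker.
fn powerset_decode(scores: &[Vec<f32>], num_speakers: usize, max_set_size: usize) -> Vec<Vec<bool>> {
    let classes = powerset_classes(num_speakers, max_set_size);
    scores
        .iter()
        .map(|frame| {
            assert_eq!(frame.len(), classes.len(), "one score per powerset class");
            let mut best = 0;
            for (i, &s) in frame.iter().enumerate() {
                if s > frame[best] {
                    best = i;
                }
            }
            let mut active = vec![false; num_speakers];
            for &spk in &classes[best] {
                active[spk] = true;
            }
            active
        })
        .collect()
}

/// Overlap-add aggregation for one speaker: window `w` starts at frame
/// `w * step`; frames covered by several windows get the mean of their scores.
fn overlap_add(windows: &[Vec<f32>], step: usize, total_frames: usize) -> Vec<f32> {
    let mut sum = vec![0.0f32; total_frames];
    let mut count = vec![0u32; total_frames];
    for (w, scores) in windows.iter().enumerate() {
        for (i, &s) in scores.iter().enumerate() {
            let t = w * step + i;
            if t < total_frames {
                sum[t] += s;
                count[t] += 1;
            }
        }
    }
    sum.iter()
        .zip(&count)
        .map(|(&s, &c)| if c > 0 { s / c as f32 } else { 0.0 })
        .collect()
}

/// Binarization: a frame counts as speech when its score reaches the threshold.
fn binarize(scores: &[f32], threshold: f32) -> Vec<bool> {
    scores.iter().map(|&s| s >= threshold).collect()
}
```

With three speakers and at most two active at once, `powerset_classes(3, 2)` yields seven classes, so each frame carries seven scores. A real pipeline would aggregate per speaker and may use hysteresis thresholds rather than the single threshold shown here.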
Speakrs has a batch mode, where you can run on multiple files at once; doing this also lets you keep CPU, GPU, and ANE all fully utilized.
This is why it's not that much faster on Linux/CUDA: pyannote is already optimized to run on CUDA, and the speed improvement we get on CUDA comes from running some stages on the CPU while others run on the GPU. The speedup on Linux will depend on how powerful the CPU is.
There is also a fast mode that sacrifices some accuracy for speed. It can be up to 50x faster, and for some types of audio it doesn't sacrifice that much accuracy. The benchmarks have more info on this.
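The batch-mode idea of keeping the accelerator and the CPU busy at the same time can be sketched as a two-stage pipeline over a bounded channel. This is only an illustration under stated assumptions: `run_inference` and `run_clustering` are hypothetical stand-ins for the model inference and CPU-side clustering stages, and nothing here reflects how Speakrs actually schedules work.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical stand-in for accelerator-bound inference on one file.
fn run_inference(file_id: usize) -> Vec<f32> {
    vec![file_id as f32; 4]
}

// Hypothetical stand-in for CPU-bound clustering of one file's embeddings.
fn run_clustering(embeddings: &[f32]) -> f32 {
    embeddings.iter().sum()
}

/// Process a batch so that inference for file N+1 can overlap with
/// clustering for file N. Results come back in input order.
fn process_batch(file_ids: Vec<usize>) -> Vec<(usize, f32)> {
    // Bounded channel: the inference stage may run at most 2 files ahead.
    let (tx, rx) = mpsc::sync_channel::<(usize, Vec<f32>)>(2);

    // Stage 1 runs on its own thread and feeds the channel.
    let producer = thread::spawn(move || {
        for id in file_ids {
            let embeddings = run_inference(id);
            if tx.send((id, embeddings)).is_err() {
                break; // receiver went away; stop early
            }
        }
        // `tx` is dropped here, which ends the receiver loop below.
    });

    // Stage 2 runs on the calling thread while stage 1 keeps producing.
    let mut results = Vec::new();
    for (id, embeddings) in rx {
        results.push((id, run_clustering(&embeddings)));
    }

    producer.join().expect("inference thread panicked");
    results
}
```

With several files in flight, neither stage has to wait for the other to finish the whole batch, which matches the post's explanation that the Linux speedup depends on how much work the CPU can absorb alongside the GPU.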

Developer Debate & Comments

No active discussions extracted for this entry yet.

Frequently Asked Questions

Market intelligence mapped to Speakrs, a Rust/ONNX implementation of the pyannote diarization pipeline.

What is the technical positioning of Speakrs, a Rust/ONNX implementation of the pyannote diarization pipeline?
Based on our AI analysis of the original developer request, its primary technical positioning is: a significantly faster, Python-runtime-free alternative to pyannote, offering 20-37x speed improvements on macOS through optimized hardware utilization (CPU, Neural Engine, GPU), with batch and fast modes.
What are the foundational technologies related to Speakrs, a Rust/ONNX implementation of the pyannote diarization pipeline?
Our proprietary extraction maps Speakrs to adjacent architectural concepts including diarization pipeline, Rust, ONNX Runtime, and CoreML.

Engagement Signals

Upvotes: 2
Comments: 0
