IgniteMS, a batch embedding engine built with Rust and TensorRT.
Hacker News
Jun 5, 2026
A quick note on how my batch embedding engine, IgniteMS, works and how I built it. The whole thing runs as a single Rust process: reading input, tokenizing, packing batches, and keeping the queue full. TensorRT handles inference. Python is only a wrapper.

I built it this way because once you use more than a couple of GPUs, the GPUs stop being the problem: the CPU cannot feed them fast enough. One A100 can go through batches faster than Python can tokenize and feed them, so the GPU just sits idle waiting for work. Most of my time went into optimizing this. At 8 GPUs that was basically the entire challenge.

On cost: I ran the big 2B-message job on a spot p4d instance (8x A100 40GB). After filtering and deduplication I had 685M raw texts. With my new engine the whole production run finishes in about half an hour. I used to run these jobs on on-demand instances and have now switched to spot. If AWS reclaims the box, I just rerun the job. A half-hour run costs roughly $7. And, at least right now, spot instances are easier to get than on-demand ones.

Fair warning: it's batch only and NVIDIA only. You can run it either as a Docker image or natively.
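For readers who want a picture of the "tokenize, pack batches, keep the queue full" loop described above, here is a minimal sketch using only the Rust standard library. It is not IgniteMS code: `tokenize`, `pack_batches`, and `run_pipeline` are hypothetical names, the tokenizer is a toy stand-in, and the consumer only counts sequences where a real engine would hand the batch to TensorRT. The one idea it shows is a bounded queue between a CPU-side producer and a GPU-side consumer, so the producer runs ahead by a fixed number of batches and blocks when the queue is full.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical stand-in for a real tokenizer: one fake id per word.
fn tokenize(text: &str) -> Vec<u32> {
    text.split_whitespace().map(|w| w.len() as u32).collect()
}

/// Greedily pack sequences into batches holding at most `max_tokens` tokens.
/// A single sequence longer than `max_tokens` still gets its own batch.
fn pack_batches(seqs: Vec<Vec<u32>>, max_tokens: usize) -> Vec<Vec<Vec<u32>>> {
    let mut batches = Vec::new();
    let mut current: Vec<Vec<u32>> = Vec::new();
    let mut used = 0;
    for seq in seqs {
        // Flush the current batch if this sequence would overflow it.
        if !current.is_empty() && used + seq.len() > max_tokens {
            batches.push(std::mem::take(&mut current));
            used = 0;
        }
        used += seq.len();
        current.push(seq);
    }
    if !current.is_empty() {
        batches.push(current);
    }
    batches
}

/// Producer thread tokenizes and packs; the caller's thread consumes.
/// Returns how many sequences the consumer received.
fn run_pipeline(texts: Vec<String>, max_tokens: usize, queue_depth: usize) -> usize {
    // Bounded channel: `send` blocks once `queue_depth` batches are waiting.
    let (tx, rx) = sync_channel::<Vec<Vec<u32>>>(queue_depth);

    let producer = thread::spawn(move || {
        let seqs: Vec<Vec<u32>> = texts.iter().map(|t| tokenize(t)).collect();
        for batch in pack_batches(seqs, max_tokens) {
            if tx.send(batch).is_err() {
                break; // consumer went away
            }
        }
        // `tx` is dropped here, which ends the consumer loop below.
    });

    let mut seen = 0;
    for batch in rx {
        // Placeholder for inference: a real engine would run the batch on the GPU.
        seen += batch.len();
    }
    producer.join().expect("producer thread panicked");
    seen
}
```

Calling `run_pipeline(texts, 4, 2)` on a small `Vec<String>` pushes every text through the queue with at most two batches waiting at any time. In a real multi-GPU setup there would presumably be several producers and one consumer per device, which this sketch leaves out.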
I used some optimizations for my production run. With default settings you can expect to see ~250K msg/sec if you run the benchmark script on your p4d box.
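To put the numbers above side by side, here is a small back-of-the-envelope helper. The function names are made up for illustration, and the inputs are the rounded figures from the post (685M texts, about 30 minutes, roughly $7, ~250K msg/sec at default settings), so the results are only as precise as those figures.

```rust
/// Average throughput in items per second.
fn throughput_per_sec(items: u64, seconds: f64) -> f64 {
    items as f64 / seconds
}

/// Cost in US dollars per one million items.
fn cost_per_million(total_cost_usd: f64, items: u64) -> f64 {
    total_cost_usd / (items as f64 / 1e6)
}

/// Seconds needed to process `items` at a fixed rate (items per second).
fn seconds_at_rate(items: u64, rate_per_sec: f64) -> f64 {
    items as f64 / rate_per_sec
}
```

With the figures from the post, `throughput_per_sec(685_000_000, 1800.0)` comes to roughly 380K texts/sec across the 8 GPUs, and `cost_per_million(7.0, 685_000_000)` to about one cent per million texts. Assuming the default-settings benchmark rate carried over to the same data, `seconds_at_rate(685_000_000, 250_000.0)` gives 2740 seconds, or about 46 minutes.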
github.com/Artain-AI/ignite-... added TensorRT 11 and 60 models, 23 of them tested on 1x and 4x A100. Happy to share details.