Request for distributed inference and multi-node clustering in ds4 across multiple Apple Silicon machines. The pain point: ds4 is currently single-process and Metal-only, which prevents scaling to larger contexts or higher throughput.
Raw Developer Origin & Technical Request
GitHub Issue
May 8, 2026
I'm wondering about the possibility of running ds4 across multiple Macs (clustering / distributed inference).
Current situation:
I understand ds4 is currently single-process and Metal-only, with no built-in model sharding or multi-node support.
I have several Apple Silicon machines and would like to combine their unified memory to run larger contexts or achieve higher throughput.
Questions:
1. Are there any plans to add multi-node / distributed inference support in the future (even basic pipeline parallelism or multi-server coordination)?
2. Would it be feasible to integrate ds4 with Exo (or similar tools) by running ds4-server on each machine and letting Exo treat them as backend nodes? Have you tested or considered this?
3. If not supported yet, do you have any recommended way to scale ds4 across multiple Macs right now?
Thanks again for the great work!
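To make the "multi-server coordination" idea concrete, here is a minimal sketch of the simplest possible setup: one independent server per Mac, with a client that rotates requests across them and skips a node that is unreachable. This is not something ds4 provides. The node addresses, the `/v1/chat/completions` path, and the names `http_post` and `RoundRobinPool` are all assumptions made for illustration; check what ds4-server actually exposes before relying on any of it.

```python
import itertools
import json
import urllib.request

# Hypothetical addresses: one server process per Mac.
NODES = ["http://mac-1.local:8000", "http://mac-2.local:8000"]


def http_post(base_url, payload, timeout=120.0):
    """POST a JSON payload to one node and return the decoded JSON reply.

    The endpoint path below is an assumption, not a documented ds4 route.
    """
    req = urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


class RoundRobinPool:
    """Rotate requests across nodes; on a network error, try the next node."""

    def __init__(self, nodes, send=http_post):
        if not nodes:
            raise ValueError("at least one node is required")
        self._nodes = list(nodes)
        self._cycle = itertools.cycle(self._nodes)
        self._send = send  # injectable so the pool can be tested offline

    def request(self, payload):
        last_error = None
        # Give every node at most one attempt per request.
        for _ in range(len(self._nodes)):
            node = next(self._cycle)
            try:
                return self._send(node, payload)
            except OSError as exc:  # URLError and socket timeouts are OSErrors
                last_error = exc
        raise RuntimeError("all nodes failed") from last_error
```

Usage would look like `RoundRobinPool(NODES).request({"messages": [...]})`. Note the limit of this approach: every Mac still has to hold the whole model, so it only adds throughput for independent requests. It does not pool memory across machines, which is what larger contexts would need and what real model sharding or pipeline parallelism would have to provide.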
Developer Debate & Comments
No active discussions extracted for this entry yet.
Adjacent Repository Pain Points
Other highly discussed features and pain points extracted from antirez/ds4.