Comment on: Show HN: sllm – Split a GPU node with other developers, unlimited tokens

by vova_hn2

Posted: Apr 4, 2026

1. Is the given tok/s estimate for the total node throughput, or is it what you can realistically expect to get? Or is it the worst case scenario throughput if everyone starts to use it simultaneously?

2. What if I try to hog all resources of a node by running some large data processing and making multiple queries in parallel? What if I try to resell the access by charging per token?

Edit: sorry if this comment sounds overly critical. I think that pooling money with other developers to collectively rent a server for LLM inference is a really cool idea. I also thought about it, but haven't found a satisfactory answer to my question number 2, so I decided that it is infeasible in practice.

Points: 132 • Comments: 66

