Comment on: Show HN: sllm – Split a GPU node with other developers, unlimited tokens
by vova_hn2
1. Is the given tok/s estimate for the total node throughput, or is it what you can realistically expect to get? Or is it the worst-case throughput if everyone starts to use it simultaneously?

2. What if I try to hog all resources of a node by running some large data processing and making multiple queries in parallel? What if I try to resell the access by charging per token?

Edit: sorry if this comment sounds overly critical. I think that pooling money with other developers to collectively rent a server for LLM inference is a really cool idea. I also thought about it, but haven't found a satisfactory answer to my question number 2, so I decided that it is infeasible in practice.
Points: 132 • Comments: 66
Posted: Apr 4, 2026