Comment on: Show HN: How I topped the HuggingFace open LLM leaderboard on two gaming GPUs

by imranq

Posted: Mar 10, 2026

Amazing write-up. I wish more people showed the process of discovery, which is often even more interesting than the result itself.

Still, the result is really interesting: being able to stack abstract reasoning and get better performance, with the heat maps to show the prob results.

The academic literature seems to be catching up:

- *[SOLAR / DUS (Kim et al., 2023)](https://arxiv.org/abs/2312.15166)*: duplicated transformer layers to build a 10.7B model that outperformed 30B parameter baselines.
- *[The Curse of Depth (2025)](https://arxiv.org/abs/2502.05795)*: explains why this works. Pre-LN causes deep transformer layers to converge toward identity functions, meaning middle layers are where real computation happens, and duplicating them concentrates that capacity.
- *[Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach (Geiping et al., NeurIPS 2025)](https://arxiv.org/abs/2502.05171)*: takes the idea to its logical conclusion, a model trained with a single recurrent block repeated at inference time, scaling reasoning depth without adding parameters.
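For anyone who wants to see what "duplicating layers" means mechanically, here is a rough sketch of the two layer-ordering schemes mentioned above, written as plain index lists with no model weights involved. The function names `repeat_middle_span` and `depth_upscale_plan` are made up for illustration and are not from the blog post or from any of the papers. The DUS-style plan follows my reading of the SOLAR recipe (two copies of the base model, one with its last `m` layers dropped and one with its first `m` dropped, then concatenated), so treat it as an approximation rather than the authors' code.

```python
def repeat_middle_span(n_layers, start, end, repeats=2):
    """Return a layer execution order in which layers [start, end) run
    `repeats` times in a row. Hypothetical helper, for illustration only."""
    if not 0 <= start < end <= n_layers:
        raise ValueError("need 0 <= start < end <= n_layers")
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    span = list(range(start, end))
    # Layers before the span, the span repeated, then the remaining layers.
    return list(range(start)) + span * repeats + list(range(end, n_layers))


def depth_upscale_plan(n_layers, m):
    """DUS-style ordering (an assumption based on the SOLAR description):
    copy A keeps layers [0, n_layers - m), copy B keeps layers [m, n_layers),
    and the two are concatenated."""
    if not 0 <= m < n_layers:
        raise ValueError("need 0 <= m < n_layers")
    return list(range(n_layers - m)) + list(range(m, n_layers))


if __name__ == "__main__":
    # A 6-layer toy model with layers 2 and 3 run twice.
    print(repeat_middle_span(6, 2, 4))
    # A 32-layer base with m = 8 gives a 48-layer stack.
    print(len(depth_upscale_plan(32, 8)))
```

In a real model, each index would point at an existing transformer block, so the deeper stack reuses weights the base model already has. Whether the result needs further training is a separate question that this sketch does not address.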


Discussion Thread

Parent Entity

Show HN: How I topped the HuggingFace open LLM leaderboard on two gaming GPUs

Points: 458 • Comments: 120

Posted: Mar 10, 2026

Other Comments / Reviews

Great work and love the detailed breakdown. This is kind ...

by vjsrinivas Mar 13, 2026
By far one of the most interesting blogs I’ve read in a l...

by BrownSol Mar 13, 2026
I'm surprised the point/comment ratio is this s...

by momojo Mar 10, 2026
The astounding thing about Goliath wasn’t that it was a h...

by mysteria Mar 10, 2026
I find the concept of LLM "brain surgery" fasci...

by iamjackg Mar 10, 2026