
block parameter N

MoonshotAI/Attention-Residuals
Status: Open
Opened: Mar 25, 2026
Just curious: have you guys tried varying the block size within a single model (for example, smaller groups in earlier layers and larger groups in later layers)?
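To make the question concrete, here is a minimal sketch of what a non-uniform schedule could look like. The helper name `block_boundaries`, the 16-layer depth, and the example sizes are all hypothetical and are not taken from the repository; the sketch only shows how per-block sizes would map to layer ranges.

```python
def block_boundaries(block_sizes):
    """Map per-block sizes to (start, end) layer index ranges, end exclusive.

    Hypothetical helper for illustration only; not part of the repository.
    """
    bounds = []
    start = 0
    for size in block_sizes:
        if size <= 0:
            raise ValueError("block sizes must be positive")
        bounds.append((start, start + size))
        start += size
    return bounds


# Uniform schedule: every block groups the same number of layers.
uniform = block_boundaries([4, 4, 4, 4])

# Non-uniform schedule: smaller groups early, larger groups late.
growing = block_boundaries([2, 2, 4, 8])

print(uniform)
print(growing)
```

Both schedules cover the same 16 layers, so a comparison between them would isolate the effect of where the block boundaries fall rather than the total depth.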