Question about Helios-Base speed in Table 3
PKU-YuanGroup/Helios
Hi, thanks for the great work! I have a question about the speed comparison in Table 3.
In Table 3, Helios-Base (14B) achieves **0.54 FPS** while Wan 2.1 14B achieves **0.33 FPS**. I don't understand where this speedup comes from, because:
1. **Helios-Base uses 50 sampling steps** (as stated in Section 5.1: "For Stages 1–2, we adopt UniPC scheduler with 50 sampling steps"), which is the same as the original Wan 14B.
2. **Multi-Term Memory Patchification** is designed to compress the historical context XHist. But for pure T2V tasks (where XHist = all zeros, as mentioned in Section 3.1.1: "if XHist is all zeros, the model performs T2V"), there's no history to compress.
**My questions:**
1. Was the 81-frame benchmark in Table 3 evaluated using **autoregressive chunk-by-chunk generation** (e.g., 9 frames per chunk) or **single-pass bidirectional generation**?
2. If it was autoregressive generation, how many frames were generated per chunk? And what's the actual token count reduction from Multi-Term Memory Patchification?
3. If it was single-pass generation, what makes Helios-Base faster than Wan 14B? The token compression should only help when there is actual history context to compress.
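
For context, here is the rough token arithmetic behind my question. It is only a sketch under my own assumptions, none of which come from the paper: a 480x832 resolution, a VAE with 4x temporal and 8x spatial compression, a 2x2 spatial patchify, and 9 pixel frames per chunk. It also ignores history tokens entirely, which is exactly the part I'm unsure about.

```python
def latent_tokens(frames, height, width, t_stride=4, s_stride=8, patch=2):
    """Estimate the number of transformer tokens for a clip.

    All strides and the patch size are assumptions, not values from the paper.
    """
    latent_frames = (frames - 1) // t_stride + 1
    tokens_per_frame = (height // s_stride // patch) * (width // s_stride // patch)
    return latent_frames * tokens_per_frame


def attention_cost(tokens):
    """Relative self-attention cost, assumed quadratic in token count."""
    return tokens ** 2


# Hypothetical benchmark settings (assumed, not taken from Table 3).
HEIGHT, WIDTH = 480, 832
TOTAL_FRAMES, FRAMES_PER_CHUNK = 81, 9

single_pass_tokens = latent_tokens(TOTAL_FRAMES, HEIGHT, WIDTH)
chunk_tokens = latent_tokens(FRAMES_PER_CHUNK, HEIGHT, WIDTH)
num_chunks = TOTAL_FRAMES // FRAMES_PER_CHUNK

# Compare one full-sequence pass against many small passes without history.
single_pass_cost = attention_cost(single_pass_tokens)
chunked_cost = num_chunks * attention_cost(chunk_tokens)

print("single-pass tokens:", single_pass_tokens)
print("tokens per chunk (no history):", chunk_tokens)
print("attention cost ratio (single / chunked):", single_pass_cost / chunked_cost)
```

If the benchmark was chunked, a gap of this kind could explain the speedup even with 50 steps per chunk, but then the history tokens added per chunk (and how much Multi-Term Memory Patchification shrinks them) would matter a lot for the real number.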
Thanks for your attention!