Comment on: I did get it working, with a lot of pain, if your interested here's a readme I had claud crank out capturing the gotchas.
Repo: danveloper/flash-moe by Shivox
Thanks for taking the time to write the detailed instructions.
A couple of things:
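- The cleanup command `find ~/qwen35-397b-4bit -maxdepth 1 ! -name packed_experts ! -name . -exec rm -rf {} +` deleted the entire model directory on vanilla macOS zsh, which cost me an hour to redo the whole process.
I think it is missing a `-mindepth 1` to keep `find` from matching, and then deleting, the starting directory itself. The `! -name .` test does not protect it, because the starting path's name is `qwen35-397b-4bit`, not `.`.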
- `expert_index.json` from this repo has the model path hardcoded, so you might want to add an instruction to update the path.
- Maybe it would be better to export the model path in an environment variable, so it would be easier to copy and paste the commands.
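To make the first and third points concrete, here is a minimal sketch of a safer cleanup. The function name `clean_model_dir` and the variable `MODEL_DIR` are hypothetical, not something from the repo or the readme; the only change to the original command is the added `-mindepth 1` and the path being passed in instead of typed out.

```shell
# Hypothetical helper: delete everything directly inside a model directory
# except packed_experts, leaving the directory itself in place.
clean_model_dir() {
  dir="$1"
  # Refuse to run on an empty or missing path.
  if [ -z "$dir" ] || [ ! -d "$dir" ]; then
    echo "not a directory: $dir" >&2
    return 1
  fi
  # -mindepth 1 skips the starting directory, so only its children are tested.
  find "$dir" -mindepth 1 -maxdepth 1 ! -name packed_experts -exec rm -rf {} +
}
```

Usage would look something like `export MODEL_DIR="$HOME/qwen35-397b-4bit"` followed by `clean_model_dir "$MODEL_DIR"`. Replacing `-exec rm -rf {} +` with `-print` first gives a dry run that only lists what would be removed.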
For performance figures: on a MacBook Pro 16 with M4 Max/48GB/1TB, I'm getting around `5.5 tok/s`.
GitHub Issue
State: Open • Comments: 3