Gemini Executive Synthesis

Fleet, an application for orchestrating and managing swarms of coding agents.

Technical Positioning
A Python orchestrator for coding agents, offering a UI for agent lifecycle management and task execution, with insights into optimizing token usage and scaling LLM interactions.
SaaS Insight & Market Implications
This submission highlights critical operational challenges in scaling LLM agent deployments. The core pain points revolve around inefficient token consumption due to poor abstraction mechanisms (CLAUDE.md, skills, indiscriminate plugin attachment) and rigid model behaviors (unmanageable system tools, lack of background session interaction). The proposed solutions—hierarchical knowledge bases, precise plugin management, and task decomposition for model routing—underscore a maturing understanding of LLM orchestration. The market implication is a clear demand for sophisticated agent management platforms that prioritize token efficiency, context control, and cost optimization. Developers require granular control over agent interactions and resource allocation to achieve scalable, cost-effective AI-driven workflows, moving beyond simplistic prompt engineering to structured, intelligent agent design.
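As an illustration of the task decomposition and model routing mentioned above, here is a minimal Python sketch. The tier names, the relative costs, and the 1 to 3 difficulty scale are all hypothetical placeholders, not real model identifiers, real prices, or anything taken from Fleet itself.

```python
from dataclasses import dataclass

# Hypothetical model tiers and relative per-token costs; the names and numbers
# are placeholders for illustration, not real model identifiers or prices.
MODEL_TIERS = [
    ("small-model", 1),    # (name, relative cost), cheapest first
    ("medium-model", 5),
    ("large-model", 25),
]


@dataclass
class Subtask:
    title: str
    difficulty: int  # assumed scale: 1 = mechanical edit, 2 = routine change, 3 = design work


def route(subtask: Subtask) -> str:
    """Map difficulty 1..3 onto the tier list, clamping out-of-range values."""
    index = min(max(subtask.difficulty, 1), len(MODEL_TIERS)) - 1
    return MODEL_TIERS[index][0]


def relative_cost(subtasks: list[Subtask]) -> int:
    """Sum the relative cost of the routed tiers, assuming equal token use per subtask."""
    costs = dict(MODEL_TIERS)
    return sum(costs[route(s)] for s in subtasks)


plan = [
    Subtask("rename a config key across the repo", 1),
    Subtask("add a unit test for the parser", 2),
    Subtask("redesign the retry strategy", 3),
]

for s in plan:
    print(f"{s.title} -> {route(s)}")

routed = relative_cost(plan)                  # 1 + 5 + 25
all_large = MODEL_TIERS[-1][1] * len(plan)    # 25 * 3
print(f"relative cost: routed={routed}, all on largest tier={all_large}")
```

Under these made-up costs the routed plan comes to 31 units against 75 if every subtask went to the largest tier. The real saving depends entirely on actual prices and on how reliably difficulty can be estimated up front.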
Proprietary Technical Taxonomy
coding agents, swarms, Python orchestrator, agent lifecycle, centralized SQLite DB, spawn agents, dependencies, coder/model

Raw Developer Origin & Technical Request

Hacker News, Jun 5, 2026
Show HN: Lessons learned from running Claude Code swarms at scale

Some time ago I built a simple app to run swarms of coding agents; I call it fleet (news.ycombinator.com/item). It's based on centralized beads with a Python orchestrator and can run any coder (Claude, agy, Codex). Recently I added a UI to manage the whole agent lifecycle: adding new tasks, monitoring running ones, and a chat interface built on MCP with a centralized SQLite DB. From the UI I can spawn agents to run in any directory, define dependencies on other tasks, and specify which coder/model should do the job.

Today I can run 10–15 agents concurrently. At that scale you burn through limits very fast, so I spent some time investigating where those limits go and how to maximize efficiency. Here are the lessons learned after a few weeks of running the fleet:

- CLAUDE.md is a terrible abstraction. These files load unconditionally, they often contain descriptions irrelevant to the task at hand, and they stack from your working directory upward. The result is wasted tokens and confusion from injecting irrelevant instructions into the session.
- Skills are bad, but not as bad as CLAUDE.md. They use a progressive disclosure approach: only the skill description goes into the session, and Claude loads the full skill text with a tool when it's needed. That's one level better, but it still doesn't let you scale: you can't create 10K skills, as that would eat your entire usable context. Claude recently introduced a skills budget that silently drops less frequently used skills from the session entirely. You can still invoke them in an interactive session, but the model can't invoke them in a background session.
- Some plugins may be installed more than once. During cleanup I found that a few of mine were installed in multiple locations, consuming double the tokens on duplicated instructions.
- Attaching plugins to every session is a bad idea at scale. You want to be precise about which plugins are actually useful and attach them per task.
- Use a hierarchical knowledge base instead of CLAUDE.md / skills / plugins. It lets you benefit from real progressive disclosure: keep your instructions and tool descriptions in it and let Claude navigate through it quickly and cheaply.
- System tools consume ~15K tokens (7% of the session). You can't manage this: they're just attached, and disabling tools doesn't remove them from the context.
- AskUserQuestion isn't available in background sessions. You need to implement your own tool, MCP- or CLI-based, to give `claude -p` the ability to talk to you.
- You become selective about which model handles each task. Decompose work into harder and simpler subtasks so you can route the simpler ones to weaker, cheaper models and save tokens.
- Your context-switching skill improves over time.

Fleet repo: github.com/sermakarevich/fle...
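The hierarchical knowledge base recommended in the lessons above relies on progressive disclosure: the agent first sees only short summaries and pulls full text for the one node it needs. The post does not show how Fleet implements this, so the following is only a minimal Python sketch under assumptions. The `Node` class, the `list_children` and `read` helpers, and the sample entries are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One entry in a hypothetical hierarchical knowledge base."""
    summary: str                      # short line shown when the parent is listed
    body: str = ""                    # full instructions, loaded only on demand
    children: dict[str, "Node"] = field(default_factory=dict)


def resolve(root: Node, path: str) -> Node:
    """Walk a slash-separated path such as 'python/testing' from the root."""
    node = root
    for part in filter(None, path.split("/")):
        node = node.children[part]    # KeyError signals an unknown path
    return node


def list_children(root: Node, path: str = "") -> str:
    """Cheap call: return only the names and one-line summaries under a path."""
    node = resolve(root, path)
    return "\n".join(
        f"{name}: {child.summary}" for name, child in sorted(node.children.items())
    )


def read(root: Node, path: str) -> str:
    """Costlier call: return the full body of a single node."""
    return resolve(root, path).body


# Sample content, invented for this example.
kb = Node("root", children={
    "python": Node("Python conventions", children={
        "testing": Node("How to run and write tests", body="Run pytest -q before committing."),
        "style": Node("Formatting rules", body="Format with black; line length 100."),
    }),
    "deploy": Node("Release checklist", body="Tag the release, then push."),
})

print(list_children(kb))            # top level only
print(list_children(kb, "python"))  # one level deeper
print(read(kb, "python/testing"))   # full text for the one node needed
```

In a real setup the two helpers would likely be exposed to the agent as tools (MCP- or CLI-based), so that only the listings it asks for and the bodies it reads enter the context.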

Developer Debate & Comments

No active discussions extracted for this entry yet.

Frequently Asked Questions

Market intelligence mapped to Fleet, an application for orchestrating and managing swarms of coding agents.

What problem does Fleet, an application for orchestrating and managing swarms of coding agents, solve?
Based on our AI analysis of the original developer request, its primary technical positioning is: A Python orchestrator for coding agents, offering a UI for agent lifecycle management and task execution, with insights into optimizing token usage and scaling LLM interactions.
Are engineers actively discussing Fleet, an application for orchestrating and managing swarms of coding agents?
Yes, we have tracked 2 direct responses and active debates regarding this specific topic originating from Hacker News.
What architecture is tied to Fleet, an application for orchestrating and managing swarms of coding agents?
Our proprietary extraction maps Fleet, an application for orchestrating and managing swarms of coding agents, to adjacent architectural concepts including coding agents, swarms, Python orchestrator, agent lifecycle.

Engagement Signals

Upvotes: 8
Comments: 2

Cross-Market Term Frequency

Quantifies the cross-market adoption of foundational terms like context and coding agents by tracking occurrence frequency across active SaaS architectures and enterprise developer debates.