GitHub Issue

Parallel file-analyzer subagents can produce inconsistent node IDs and invalid complexity values

Discovered On Mar 30, 2026
Primary Metric open
## Problem

When `/understand --full` dispatches parallel file-analyzer subagents, there is no deterministic enforcement of node ID format or complexity enum values. The prompt specifies the correct formats, but the assembly pipeline trusts LLM output without validation, so inconsistent batches silently corrupt the final graph.

## Issue 1: No runtime enforcement of node ID format

The file-analyzer prompt (`skills/understand/file-analyzer-prompt.md`, lines 219–227) specifies:

| Node Type | Required Format | Example |
|---|---|---|
| File | `file:` | `file:src/index.ts` |
| Function | `func::` | `func:src/utils.ts:formatDate` |
| Class | `class::` | `class:src/models/User.ts:User` |

However, the Zod schema only validates `id: z.string()` (`packages/core/src/schema.ts`, line 13), so any string passes. Neither Phase 3 (ASSEMBLE) nor the `GraphBuilder` (`packages/core/src/analyzer/graph-builder.ts`, line 84) validates the ID prefix format on merged batch output.

Since subagents are LLMs writing JSON directly to `batch-.json` files, they can produce:

- Project-name-prefixed IDs: `myproject:backend/main.py`
- Double-prefixed IDs: `myproject:service:docker-compose.yml`
- Bare paths with no prefix: `frontend/src/utils/constants.ts`

**Evidence from a 226-file project run:**

```jsonc
// Batch 1: correct
{ "id": "file:backend/app/api/audit.py" }

// Batch 4: project name prefixed
{ "id": "noora-health-res-cms:backend/main.py"...
```
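A deterministic check for the three failure modes listed above could look roughly like the sketch below. It is not the project's code: `checkNodeId`, `findInvalidNodeIds`, and the `SEGMENTS_AFTER_PREFIX` table are hypothetical names, and the segment counts are inferred from the Example column of the table (one path segment after `file:`, a path plus a name after `func:` and `class:`). It also assumes paths never contain `:`.

```typescript
// Hypothetical validator; names and rules are assumptions inferred from the
// examples in this issue, not taken from packages/core.

// Number of colon-separated segments expected after each known prefix.
const SEGMENTS_AFTER_PREFIX: Record<string, number> = {
  file: 1, // e.g. file:src/index.ts
  func: 2, // e.g. func:src/utils.ts:formatDate
  class: 2, // e.g. class:src/models/User.ts:User
};

interface NodeIdCheck {
  id: string;
  valid: boolean;
  reason?: string;
}

function checkNodeId(id: string): NodeIdCheck {
  const sep = id.indexOf(":");
  if (sep === -1) {
    // Bare path such as frontend/src/utils/constants.ts
    return { id, valid: false, reason: "missing type prefix" };
  }

  const prefix = id.slice(0, sep);
  if (!Object.prototype.hasOwnProperty.call(SEGMENTS_AFTER_PREFIX, prefix)) {
    // Project-name or double-prefixed IDs land here.
    return { id, valid: false, reason: `unknown prefix "${prefix}"` };
  }

  const expected = SEGMENTS_AFTER_PREFIX[prefix];
  const parts = id.slice(sep + 1).split(":");
  if (parts.length !== expected || parts.some((p) => p.length === 0)) {
    return {
      id,
      valid: false,
      reason: `expected ${expected} non-empty segment(s) after "${prefix}:"`,
    };
  }

  return { id, valid: true };
}

// Collects every failing ID in a batch so the run can reject or repair it
// before merging.
function findInvalidNodeIds(nodes: { id: string }[]): NodeIdCheck[] {
  return nodes.map((n) => checkNodeId(n.id)).filter((r) => !r.valid);
}
```

Running `findInvalidNodeIds` on each batch before the ASSEMBLE phase would surface malformed IDs at merge time instead of in the final graph. The same predicate could presumably be attached to the existing schema through Zod's `.refine()`, though whether rejection or normalization is the right response is a design decision for the maintainers.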