← Back to AI Insights
Gemini Executive Synthesis

Architectural decision (ADR-005) for a multi-model, multi-provider, and tool strategy, addressing compatibility and routing complexities.

Technical Positioning
Establishing a robust, intelligent, and adaptable architecture for GSD2 to seamlessly integrate and manage diverse AI models and providers, ensuring tool compatibility and optimal model selection for autonomous agents. The goal is to enable agents to "work for long periods of time autonomously without losing track of the big picture."
SaaS Insight & Market Implications
ADR-005 outlines a critical architectural evolution for GSD2, moving beyond capability-aware routing to address fundamental multi-model, multi-provider, and tool compatibility challenges. The current system assumes tool compatibility, leading to potential failures with provider-specific schema limitations, differing tool call ID formats, and varied content handling. The proposed 4-step routing pipeline (Tier Eligibility → Technical Filtering → Capability Ranking → Tool Set Adjustment) and a declarative registry for provider "quirks" are essential for robust, intelligent model selection. Market implication: as AI agent systems increasingly rely on diverse LLM providers and specialized tools, managing this heterogeneity is paramount. A system that intelligently handles provider-specific nuances and ensures tool compatibility will gain a significant competitive advantage, enabling more reliable and scalable autonomous agent deployments. Inconsistencies noted in the discussion, however, indicate implementation complexities requiring careful resolution.
Proprietary Technical Taxonomy
ADR-005 Multi-Model, Multi-Provider, and Tool Strategy capability-aware model routing (ADR-004) one-dimensional complexity-tier system two-dimensional system 7 capability dimensions heterogeneous pool Tool compatibility is assumed, not verified

Raw Developer Origin & Technical Request

Source Icon GitHub Issue Mar 27, 2026
Repo: gsd-build/gsd-2
ADR-005: Multi-Model, Multi-Provider, and Tool Strategy

# ADR-005: Multi-Model, Multi-Provider, and Tool Strategy

**Status:** Draft
**Date:** 2026-03-27
**Deciders:** Jeremy McSpadden
**Related:** ADR-004 (capability-aware model routing), ADR-003 (pipeline simplification), [PR #2755](github.com/gsd-build/gsd-2/p...

## Context

PR #2755 lands capability-aware model routing (ADR-004), extending the router from a one-dimensional complexity-tier system to a two-dimensional system that scores models across 7 capability dimensions. GSD can now intelligently pick the best model for a task from a heterogeneous pool.

But model selection is only one piece of the multi-model puzzle. The system now faces a set of structural gaps that become more pressing as users configure diverse provider pools:

### 1. Tool compatibility is assumed, not verified

Every registered tool is sent to every model regardless of provider. The `pi-ai` layer normalizes tool schemas per provider (Anthropic `tool_use`, OpenAI `function`, Google `functionDeclarations`, Bedrock `toolSpec`, Mistral `FunctionTool`), but there is no mechanism to express that:

- A model may not support tool calling at all (older/smaller models, some local models)
- A provider may not support certain schema features (Google Gemini doesn't support `patternProperties`; `sanitizeSchemaForGoogle()` patches this silently)
- Some tools produce image content in results that not all models can consume
- Tool call ID formats differ across providers (Anthropic: 64-char alphanumeric; O...

Developer Debate & Comments

jeremymcs • Mar 27, 2026
Codex [P1] `ProviderSwitchReport` cannot be consumed by `before_model_select` at the point the ADR says it can. In the ADR, the report is defined after provider switching and message transformation (`Cross-Provider Conversation Continuity`), but line 240 says it should be available to `before_model_select`. That hook currently runs before scoring/selection, and even in the ADR pipeline it still fires before a concrete winning model/provider pair exists. At that point there is no single `fromApi -> toApi` switch to report yet, only a set of candidates. So this part of the design is internally inconsistent: either the hook has to move later, or it needs a different input such as per-candidate predicted switch cost instead of a realized `ProviderSwitchReport`. Relevant ADR lines: 223-240. [P1] The proposed `models.json` override path is keyed at the wrong layer and will not map cleanly onto current config semantics. The ADR example uses `providers.openai-responses.capabilities` (lines 2...
jeremymcs • Mar 27, 2026
### **Gemini ADR-005 Review: Multi-Model, Multi-Provider, and Tool Strategy** I have reviewed the proposal and its alignment with the existing routing architecture (ADR-004). This is a necessary evolution that correctly treats technical compatibility as a prerequisite for capability scoring. #### **Key Findings** * **Architectural Robustness:** The 4-step routing pipeline (**Tier Eligibility → Technical Filtering → Capability Ranking → Tool Set Adjustment**) is sound. It prevents "capability-blind" routing where a model might be highly ranked for reasoning but technically incapable of using the required tools. * **Data Continuity:** The `ProviderSwitchReport` is a critical addition. Tracking "context loss" (thinking blocks, signatures) when moving between heterogeneous providers (e.g., Anthropic to Google) is essential for long-term session health. * **Maintainability:** Centralizing provider "quirks" (schema limits, tool ID formats) into a declarative registry is a major impro...
jeremymcs • Mar 27, 2026
## ADR-005 Review: Findings and Recommendations (Revised) As Grok, built by xAI, I've reviewed ADR-005: Multi-Model, Multi-Provider, and Tool Strategy based on a deep exploration of the codebase and the ADR content. Here's my analysis and recommendations, now incorporating additional changes from a deep dive into the issue comments. ### Overall Assessment - **Architectural Soundness**: Excellent. The ADR builds logically on ADR-004's capability scoring, treating tool compatibility as a hard prerequisite filter rather than a soft score. This prevents routing to incompatible models while preserving the tiered, downgrade-safe pipeline. - **Clarity and Structure**: High. The document is well-organized with clear principles, code examples, and a phased implementation. It explicitly addresses gaps in the current system (tool assumptions, provider failover degradation). - **Feasibility**: Strong. Direct extensions to existing files (e.g., `model-router.ts` for filtering, `ToolDefinition` fo...
jeremymcs • Mar 27, 2026
## Commit: `8dc83440` — Close capability validation gaps across all dispatch paths **Branch:** `feat/provider-capability-registry` ### What was done 1. **Expanded `getRequiredToolNames()`** — from 4 to 16 unit types with accurate tool requirements (plan-*, run-uat, replan-slice, complete-*, reactive-execute, rewrite-docs, reassess-roadmap, gate-evaluate) 2. **Fixed 3 unprotected dispatch paths** that bypassed all capability checks: - `auto-direct-dispatch.ts` (`/gsd dispatch`) — now applies capability overrides + tool filtering - `guided-flow.ts` (`dispatchWorkflow`) — now filters incompatible tools after model selection - `auto.ts` (`dispatchHookUnit`) — now gets capability overrides + tool adjustment 3. **New `applyToolCompatibilityAdjustment()`** — reusable function used by all 3 dispatch paths 4. **Plan-time capability validation** — `validatePlanCapabilities()` checks task descriptions/files against model pool capabilities; `formatCapabilityConstraints()` injected i...
Sigmabrogz • Mar 28, 2026
Hi! I'd like to work on this. My understanding is that the current model routing system (ADR-004) scores models based on capabilities but assumes universal tool compatibility, which leads to silent failures and degraded context when switching providers. It lacks a declarative way to express provider constraints (like image support or schema limits) and tool requirements. I'm planning to fix it by implementing the Phase 1-3 roadmap described in the ADR: building the `ProviderCapabilities` registry in `pi-ai` to replace scattered provider-specific logic, adding `ToolCompatibility` metadata to `ToolDefinition`, and integrating the new hard-filtering step into `resolveModelForComplexity()` before the ADR-004 capability scoring. I will also build out the `ProviderSwitchReport` for cross-provider context tracking. This will be a substantial architectural lift across the core packages. Does this approach sound good?

Adjacent Repository Pain Points

Other highly discussed features and pain points extracted from gsd-build/gsd-2.

Extracted Positioning
Integration of Gherkin DSL and cryptographic locking for improved AI code generation reliability
Algorithmically reliable, spec-driven AI code generation system
Top Replies
github-actions[bot] • Mar 26, 2026
👋 Thanks for opening this issue! This was automatically flagged for maintainer review. **Flag:** Complexity without user value This proposal introduces significant architectural complexity (crypto...
igouss • Mar 26, 2026
I think is not a bad idea. > BDD (Behavior-Driven Development) is a software development approach where you define how the system should behave from the user’s perspective before writing the actual...
0mm-mark • Mar 26, 2026
> It's kind of a natural fit to describe what needs to be done to AI. Agree. And instinctively i've been interacting with AI using Gherkin habits.... But it was nice to see a formal demonstration a...
Extracted Positioning
Hardening and extending GSD-2's headless mode and JSON-RPC protocol for ecosystem integration
GSD-2 as the execution backend for the broader AI agent ecosystem (OpenClaw, MCP, CI/CD)
Top Replies
glittercowboy • Mar 25, 2026
**Overall Impression:** The proposal to solidify the headless mode and JSON-RPC protocol as a programmable surface is a highly strategic and necessary evolution. By treating GSD as an execution eng...
glittercowboy • Mar 25, 2026
Independent audit against current `main` plus the cited external surfaces. I think the direction is good, but I would not merge this ADR as written yet. There are a few baseline mismatches in the “...
glittercowboy • Mar 25, 2026
## Independent ADR Review Thorough review grounding each claim against the current codebase and external ecosystem state. --- ### Overall Assessment This is a well-structured ADR with genuine strat...
Extracted Positioning
Architectural decision to modularize GSD2's monolithic structure into shippable extensions with install infrastructure.
Evolving GSD2 from a monolithic application to a modular, extensible platform with optimized resource consumption and improved performance, enhancing its appeal as a "powerful meta-prompting, context engineering and spec-driven development system."
Top Replies
jeremymcs • Mar 28, 2026
## ADR-006 Review: Findings and Recommendations As Grok, built by xAI, I've reviewed ADR-006: Extension Modularization & Install Infrastructure based on a deep exploration of the codebase and the A...
jeremymcs • Mar 28, 2026
## Research Findings (2026-03-28) 4 parallel researchers completed — Stack, Features, Architecture, Pitfalls. Full synthesis in `.planning/research/SUMMARY.md`. ### Stack - **One new dependency:** ...
jeremymcs • Mar 28, 2026
## Implementation Plan: Extension Modularization **Full plan:** [`.plans/IMPLEMENTATION-PLAN-extension-modularization.md`](https://github.com/jeremymcs/gsd-2/blob/feat/extension-system-analysis/.pl...

Engagement Signals

8
Replies
open
Issue Status

Cross-Market Term Frequency

Quantifies the cross-market adoption of foundational terms like declarative registry and ADR-005 by tracking occurrence frequency across active SaaS architectures and enterprise developer debates.