Gemini Executive Synthesis

Ktx is an open-source executable context layer for data agents. It provides business context via Markdown wiki pages and queryable definitions via YAML files (tables, row grain, joins, measures, dimensions, filters, filter groups). Ktx's planner compiles warehouse SQL, choosing join paths from grain and relationship metadata and catching issues like join fanout.
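To make "join fanout" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names follow the retailer example in the original post; the sample rows and the `line_amount_cents` column are assumptions added for illustration, and none of this is Ktx code.

```python
import sqlite3

def fanout_demo():
    """Return (naive, correct, true_total) revenue figures for toy data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, total_amount_cents INTEGER);
        -- line_amount_cents is a hypothetical item-level amount column
        CREATE TABLE order_items (order_id INTEGER, product_id TEXT, line_amount_cents INTEGER);
        INSERT INTO orders VALUES (1, 3000), (2, 1000);
        INSERT INTO order_items VALUES
            (1, 'A', 2000), (1, 'B', 1000),  -- order 1 has two line items
            (2, 'A', 1000);                  -- order 2 has one line item
    """)
    # Naive query: an order-grain amount summed after joining to item grain.
    # Order 1's total is repeated once per line item.
    naive = dict(conn.execute("""
        SELECT oi.product_id, SUM(o.total_amount_cents)
        FROM orders o
        JOIN order_items oi ON oi.order_id = o.order_id
        GROUP BY oi.product_id
    """).fetchall())
    # Grain-correct query: sum an amount that lives at the item grain.
    correct = dict(conn.execute("""
        SELECT product_id, SUM(line_amount_cents)
        FROM order_items
        GROUP BY product_id
    """).fetchall())
    true_total = conn.execute(
        "SELECT SUM(total_amount_cents) FROM orders"
    ).fetchone()[0]
    conn.close()
    return naive, correct, true_total

if __name__ == "__main__":
    naive, correct, true_total = fanout_demo()
    print("naive:", naive, "sum:", sum(naive.values()))
    print("correct:", correct, "sum:", sum(correct.values()))
    print("true total:", true_total)
```

Both queries run without error, which is why the problem is easy to miss: only the per-product sums of the naive query exceed the true order total.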

Technical Positioning
Aims to make data agents 'reliable on your data stack' by tackling what the authors call the #1 issue, accuracy: agents generate SQL that is valid but not always correct. Positions itself as an improvement over traditional semantic layers by integrating unstructured business context and compiling SQL from defined measures instead of having the agent write the whole query.
SaaS Insight & Market Implications
Ktx targets the reliability problem of data agents generating incorrect SQL, a significant impediment to their enterprise adoption. By introducing an executable context layer that combines structured, queryable definitions with unstructured business context, Ktx aims to have agents produce accurate, business-rule-compliant queries. This approach is meant to mitigate common errors such as stale columns, join fanout, and mismatched attribution logic. Its Apache 2.0 license and its ingestion from warehouses, modeling tools, BI platforms, and doc tools position it as a candidate building block for AI-powered data stacks, where trust in AI-generated answers is the limiting factor.
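The attribution problem mentioned above can be illustrated with a short, self-contained sketch. The orders, touches, and model names below are hypothetical sample data, not taken from Ktx; the point is only that the same touches yield different "revenue by campaign" answers depending on the attribution rule.

```python
from collections import defaultdict

# Hypothetical sample data: order_id -> revenue in cents.
ORDERS = {101: 10000, 102: 5000}

# Hypothetical marketing touches: (order_id, campaign, touch sequence number).
TOUCHES = [
    (101, "spring_sale", 1),
    (101, "newsletter", 2),
    (101, "retargeting", 3),
    (102, "newsletter", 1),
]

def attribute(model):
    """Credit order revenue to campaigns under the given attribution model."""
    # Collect each order's campaigns in touch order.
    by_order = defaultdict(list)
    for order_id, campaign, _seq in sorted(TOUCHES, key=lambda t: t[2]):
        by_order[order_id].append(campaign)

    revenue = defaultdict(int)
    for order_id, campaigns in by_order.items():
        if model == "first_touch":
            credited = [campaigns[0]]
        elif model == "last_touch":
            credited = [campaigns[-1]]
        elif model == "every_touch":
            # Full credit to every touch: totals exceed actual revenue.
            credited = campaigns
        else:
            raise ValueError(f"unknown attribution model: {model}")
        for campaign in credited:
            revenue[campaign] += ORDERS[order_id]
    return dict(revenue)

if __name__ == "__main__":
    for model in ("first_touch", "last_touch", "every_touch"):
        print(model, attribute(model))
```

Under first touch the top campaign is `spring_sale`, under last touch it is `retargeting`, and under every touch it is `newsletter`, so an agent that silently picks one rule can change the answer to "which campaign drove the most revenue".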
Proprietary Technical Taxonomy
open-source, executable context layer, data agents, data stack, production-grade data agents, Claude Code, Codex, data warehouse

Raw Developer Origin & Technical Request

Hacker News • May 29, 2026
Show HN: Ktx – Open-source executable context layer for data agents

Hi HN, we’re open-sourcing ktx. It’s an executable context layer that makes agents reliable on your data stack.

We built it after going through the experience of building production-grade data agents for dozens of companies. If you’ve also tried building them, or simply tried using Claude Code or Codex on your data warehouse, you’ll know that accuracy is the #1 issue. Agents are great at generating valid SQL, but it’s not always correct SQL.

To cite a few examples of “agents gone wrong”:

- Stale column + hidden business rule: when preparing a board report, a finance analyst asks Claude Code for “ARR by customer segment”. It derives ARR from multiple tables (subscriptions, plans, accounts), then groups by accounts.industry. But CC doesn’t know that this industry column was deprecated a few months prior, or that past board reports excluded paused subscriptions from the ARR calculation.
- Join fanout: a data analyst at a retailer uses their company’s internal agent to prep a product revenue deck for a QBR. The agent joins orders to order_items, then sums orders.total_amount_cents grouped by order_items.product_id. The SQL runs fine, but each order’s revenue is repeated once per line item, which most people will miss if most orders only have 1 item.
- Missing attribution logic: a marketing analyst asks Codex “Which campaigns drove the most revenue?” Codex joins marketing_touches to users to orders and groups by utm_campaign. But since each order can have multiple touches before purchase, the same order can be credited to first touch, last touch, every touch, or every campaign the user clicked before buying. If the agent chooses the method that doesn’t match the team’s attribution logic, they’ll make suboptimal decisions.

To solve this, at first we gave the agent more context through skills + a wiki-style knowledge base. That gives it some useful extra context but still relies on it writing the SQL without incorrect assumptions.

The next solution we explored was implementing a classic semantic layer. That solves the executable part, but they’re such a pain to build and maintain since they were made for legacy BI tools. Plus, as a standalone tool, they lack all the useful context from unstructured data sources like internal docs.

So we built ktx and split it into 2 parts:

1. Business context goes in Markdown wiki pages that are auto-ingested and auto-populated
2. Queryable definitions go into YAML files that define tables, row grain, joins, measures, dimensions, filters, and filter groups

That way, when an agent needs a metric, it asks ktx for a measure, dimensions, filters, and filter groups instead of writing the whole query itself. ktx’s planner chooses the join path, uses grain and relationship metadata, catches issues like join fanout and chasm joins, and compiles the warehouse SQL, while utilizing the extra unstructured knowledge it has access to.

ktx is Apache 2.0. It can ingest from most warehouses (BigQuery, Snowflake, Postgres & others), modeling tools (dbt, MetricFlow, LookML), BI tools (Looker, Metabase), doc tools like Notion, and corrections from user interactions.

Install manually:

    npm install -g @kaelio/ktx
    ktx setup

Or give this prompt to your agent:

    Run npx skills add Kaelio/ktx --skill ktx and use ktx skill to install and configure ktx

We’d especially like feedback from people who’ve tried using Claude Code, Codex, or building custom agents on analytics warehouses. Where did they fail? And what did you try to make the answers more reliable?
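The "ask for a measure, get compiled SQL" flow described in the post can be sketched as a toy planner. Everything here is an assumption for illustration: the `MODEL` dictionary only loosely mirrors what YAML definitions might hold, `compile_query` and `FanoutError` are hypothetical names, the `segment` and `status` columns are invented, and the fanout check uses relationship cardinality only. It is not ktx's actual schema, API, or planner.

```python
# Hypothetical definitions, loosely mirroring what YAML files might declare.
MODEL = {
    # (base_table, joined_table): (join condition, cardinality base -> joined)
    "joins": {
        ("orders", "customers"): (
            "orders.customer_id = customers.customer_id", "many_to_one"),
        ("orders", "order_items"): (
            "orders.order_id = order_items.order_id", "one_to_many"),
    },
    "measures": {
        "revenue": {"table": "orders", "sql": "SUM(orders.total_amount_cents)"},
    },
    "dimensions": {
        "segment": {"table": "customers", "sql": "customers.segment"},
        "product_id": {"table": "order_items", "sql": "order_items.product_id"},
    },
    # Filters are assumed to reference the measure's base table only.
    "filters": {
        "completed_only": {"sql": "orders.status = 'completed'"},
    },
}

class FanoutError(ValueError):
    """Raised when a join would repeat the measure's rows."""

def compile_query(measure, dimensions=(), filters=()):
    """Compile a (measure, dimensions, filters) request into one SQL string."""
    m = MODEL["measures"][measure]
    base = m["table"]
    select, joins, joined = [], [], {base}
    for name in dimensions:
        d = MODEL["dimensions"][name]
        if d["table"] not in joined:
            key = (base, d["table"])
            if key not in MODEL["joins"]:
                raise KeyError(f"no join path from {base} to {d['table']}")
            condition, cardinality = MODEL["joins"][key]
            if cardinality == "one_to_many":
                # Joining to a finer grain would duplicate base rows.
                raise FanoutError(
                    f"{measure} is defined on {base}; joining {d['table']} "
                    "would repeat its rows")
            joins.append(f"JOIN {d['table']} ON {condition}")
            joined.add(d["table"])
        select.append(f"{d['sql']} AS {name}")
    where = [MODEL["filters"][f]["sql"] for f in filters]

    columns = select + [m["sql"] + " AS " + measure]
    sql = "SELECT " + ", ".join(columns) + " FROM " + base
    if joins:
        sql += " " + " ".join(joins)
    if where:
        sql += " WHERE " + " AND ".join(where)
    if select:
        # Group by the dimension columns by ordinal position.
        sql += " GROUP BY " + ", ".join(str(i + 1) for i in range(len(select)))
    return sql

if __name__ == "__main__":
    print(compile_query("revenue", ["segment"], ["completed_only"]))
```

In this sketch, requesting `revenue` by `product_id` raises `FanoutError` instead of returning SQL, which is the kind of guardrail the post attributes to a planner that knows the relationships between tables.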

Developer Debate & Comments

yooibox • May 29, 2026
cool
banditelol • May 29, 2026
Sounds cool, I want to try this kind of thing out, but do you have, or plan to have, a sandboxing environment where the agent can try running the query in, let's say, DuckDB first to confirm its validity/result before sending it over to BQ? Or use something like TABLESAMPLE when developing the query to reduce cost? One more thing: how do you compare with nao ( https://github.com/getnao/nao )? It's something I've followed for a while, and it seems to address a similar issue to what ktx is building.
lifeisstillgood • May 28, 2026
So, and I may be oversimplifying, you are creating awesome documents and references that I would have loved to have for my different jobs 5-10 years ago (or more).

It’s just that making such docs had next to no ROI 10 years ago. But today they are the difference between success and failure.

It’s fascinating - thank you

(And who writes the wiki / business rules? Can they be reverse engineered from an existing query stack?)

Sounds great - all the best

Edit: don’t take the above as criticism - just trying to fit new ideas into an old dog.
sid0707 • May 28, 2026
[flagged]
MadGodInc • May 28, 2026
Interesting approach. Context management for agents is an underexplored area. I've been looking at similar problems - the key challenge is keeping token usage low while giving the agent enough context to be useful. Tiered retrieval (facts first, full text only when needed) seems to work well in practice.
qasimkhan07 • May 28, 2026
[dead]
modus-tollens • May 28, 2026
How are you measuring the accuracy? Are you running this against any benchmarks?

I see this covers a file based approach, was there ever a consideration for a graph based approach?

For business context, how do you handle context that evolves over time?
tarun_anand • May 28, 2026
How does this compare with Wren 2.0, OpenVikings etc

Frequently Asked Questions

Market intelligence mapped to Ktx, an open-source executable context layer for data agents.

What is the technical positioning of Ktx?
Based on our AI analysis of the original developer request, its primary technical positioning is: it aims to make data agents 'reliable on your data stack' by tackling what the authors call the #1 issue, accuracy, where agents generate SQL that is valid but not always correct. It positions itself as an improvement over traditional semantic layers by integrating unstructured business context and compiling SQL from defined measures instead of having the agent write the whole query.
What is the general sentiment around Ktx?
We have tracked 14 direct responses and active debates on this topic originating from Hacker News.
What are the foundational technologies related to Ktx?
Our proprietary extraction maps Ktx to adjacent architectural concepts including open-source, executable context layer, data agents, and data stack.

Engagement Signals

58 Upvotes
14 Comments

Cross-Market Term Frequency

Quantifies the cross-market adoption of foundational terms like Claude Code and open-source by tracking occurrence frequency across active SaaS architectures and enterprise developer debates.