Gemini Executive Synthesis

Finalrun – Spec-driven testing using English and vision for mobile apps, with open-sourced core components for test generation and vision-based execution.

Technical Positioning
A mobile app testing solution that overcomes brittle selectors and out-of-sync test flows by using vision-based agents and generating tests directly from codebase context for Android and iOS.
SaaS Insight & Market Implications
Finalrun targets two inefficiencies in mobile app testing: brittle selectors and manual test maintenance. Its vision-based agent lets tests be written in plain English and run across Android and iOS without XPath or accessibility-ID selectors. The core idea is to generate tests from codebase context and keep them alongside the code, which helps them stay in sync and reduces the maintenance overhead common in traditional test management. The demo's 'post-development hand-off', where an AI builds a feature and Finalrun immediately generates and executes a vision-based test for it, shows a highly automated, integrated QA workflow. For B2B SaaS teams, this could accelerate release cycles, improve test reliability, and reduce the cost of quality assurance, particularly in fast-paced development environments that use AI for code generation.
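One way to picture tests that live alongside the codebase is a flow record that remembers which source files it was generated from, so drift can be detected cheaply. The Python sketch below is illustrative only: `EnglishFlow`, `make_flow`, and `stale_sources` are hypothetical names, the file paths and contents are made up, and none of this is Finalrun's actual format or API.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class EnglishFlow:
    """Hypothetical repo-resident test flow written as plain-English steps."""
    name: str
    steps: list[str]
    # Source path -> SHA-256 digest of that file's content at generation time.
    source_digests: dict[str, str] = field(default_factory=dict)


def digest(content: str) -> str:
    """Return the SHA-256 hex digest of a file's text content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def make_flow(name: str, steps: list[str], sources: dict[str, str]) -> EnglishFlow:
    """Build a flow and record a digest for every source file it was derived from."""
    return EnglishFlow(name, steps, {path: digest(text) for path, text in sources.items()})


def stale_sources(flow: EnglishFlow, current_sources: dict[str, str]) -> list[str]:
    """List source paths that changed or disappeared since the flow was generated."""
    stale = []
    for path, old_digest in flow.source_digests.items():
        current = current_sources.get(path)
        if current is None or digest(current) != old_digest:
            stale.append(path)
    return sorted(stale)


# Assumed example data: one made-up source file and a two-step flow.
sources = {"app/login/LoginScreen.kt": "fun login() {}"}
flow = make_flow(
    "login",
    ["Tap the Login button", "Verify the home screen is visible"],
    sources,
)

print(stale_sources(flow, sources))  # [] because nothing changed
changed = {"app/login/LoginScreen.kt": "fun login(user: String) {}"}
print(stale_sources(flow, changed))  # ['app/login/LoginScreen.kt']
```

A check like this could run in CI so that only flows whose sources changed are regenerated, which may help with the token cost the author mentions when generating tests from the whole codebase.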
Proprietary Technical Taxonomy
Spec-driven testing, English, vision, mobile apps, brittle selectors, XPath, accessibility IDs, vision-based agent

Raw Developer Origin & Technical Request

Hacker News, Apr 8, 2026
Show HN: Finalrun – Spec-driven testing using English and vision for mobile apps

I wanted to test mobile apps in plain English instead of relying on brittle selectors like XPath or accessibility IDs.

With a vision-based agent, that part actually works well. It can look at the screen, understand intent, and perform actions across Android and iOS.

The bigger problem showed up around how tests are defined and maintained.

When test flows are kept outside the codebase (written manually or generated from PRDs), they quickly go out of sync with the app. Keeping them updated becomes a lot of effort, and they lose reliability over time.

I then tried generating tests directly from the codebase (via MCP). That improved sync, but introduced high token usage and slower generation.

The shift for me was realizing test generation shouldn’t be a one-off step. Tests need to live alongside the codebase so they stay in sync and have more context.

I kept the execution vision-based (no brittle selectors), but moved test generation closer to the repo.

I’ve open sourced the core pieces:
1. Generate tests from codebase context
2. YAML-based test flows
3. Vision-based execution across Android and iOS

Repo: github.com/final-run/finalru...
Demo:

In the demo video, you’ll see the "post-development hand-off." An AI builds a feature in an IDE, and Finalrun immediately generates and executes a vision-based test that verifies the AI-built feature.

Developer Debate & Comments

clyp • Apr 8, 2026
[dead]
usual_engineer • Apr 8, 2026
Verification of AI generated code right would be dope. We do something similar in our company for web with Playwright but facing a lot of flaky tests. Will check this out
srinidhigs829 • Apr 8, 2026
I just ran my first test. Thanks team :)
ashish004 • Apr 7, 2026
Just updated README.md, it's a lot simpler and focuses on the core. Thanks for the feedback, please check it out
gavinray • Apr 7, 2026
> The shift for me was realizing test generation shouldn’t be a one-off step. Tests need to live alongside the codebase so they stay in sync and have more context.
Does the actual test code generated by the agent get persisted to project? If not, you have kicked the proverbial can down the road.
rootally7 • Apr 7, 2026
[dead]
avikaa • Apr 7, 2026
This solves a massive headache. The drift between externally generated tests and an active codebase is a brutal problem to maintain. Using vision-based execution instead of brittle XPaths is a great baseline, but moving the test definitions to live directly alongside the repo context is definitely the real win here. Did you find that generating the YAML from the codebase context entirely eliminated the "stale test" issue, or do developers still need to manually tweak the generated YAML when mobile UI layouts change drastically? Great project!
sahilahuja • Apr 7, 2026
Agentic testing. Kudos to your decision to open-source it!
arnold_laishram • Apr 7, 2026
Looks pretty cool. How does your agent understand plain English?

Frequently Asked Questions

Market intelligence mapped to Finalrun – Spec-driven testing using English and vision for mobile apps, with open-sourced core components for test generation and vision-based execution.

What problem does Finalrun – Spec-driven testing using English and vision for mobile apps, with open-sourced core components for test generation and vision-based execution – solve?
Based on our AI analysis of the original developer request, its primary technical positioning is: A mobile app testing solution that overcomes brittle selectors and out-of-sync test flows by using vision-based agents and generating tests directly from codebase context for Android and iOS.
What is the general sentiment around Finalrun – Spec-driven testing using English and vision for mobile apps, with open-sourced core components for test generation and vision-based execution?
We have tracked 10 direct responses and active debates on this topic originating from Hacker News.
What architecture is tied to Finalrun – Spec-driven testing using English and vision for mobile apps, with open-sourced core components for test generation and vision-based execution?
Our proprietary extraction maps Finalrun to adjacent architectural concepts including Spec-driven testing, English, vision, and mobile apps.

Engagement Signals

Upvotes: 23
Comments: 10

Cross-Market Term Frequency

Quantifies the cross-market adoption of foundational terms like repo and iOS by tracking occurrence frequency across active SaaS architectures and enterprise developer debates.