Hacker News

Show HN: Finalrun – Spec-driven testing using English and vision for mobile apps

21 points by ashish004 today at 2:33 PM | 7 comments

I wanted to test mobile apps in plain English instead of relying on brittle selectors like XPath or accessibility IDs.

With a vision-based agent, that part actually works well. It can look at the screen, understand intent, and perform actions across Android and iOS.

The bigger problem showed up around how tests are defined and maintained.

When test flows are kept outside the codebase (written manually or generated from PRDs), they quickly go out of sync with the app. Keeping them updated becomes a lot of effort, and they lose reliability over time.

I then tried generating tests directly from the codebase (via MCP). That improved sync, but introduced high token usage and slower generation.

The shift for me was realizing test generation shouldn’t be a one-off step. Tests need to live alongside the codebase so they stay in sync and have more context.

I kept the execution vision-based (no brittle selectors), but moved test generation closer to the repo.

I’ve open-sourced the core pieces:

1. Generate tests from codebase context
2. YAML-based test flows
3. Vision-based execution across Android and iOS
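
To make the "tests live alongside the codebase" idea concrete, here is a minimal sketch of how a plain-English flow stored in the repo could be flagged as stale when the code it was generated from changes. Everything in it is an assumption for illustration: the `Flow` structure, its field names, the `stale_sources` helper, and the `ui/login.kt` path are hypothetical and are not Finalrun's actual schema or API. See the repo for the real YAML format.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Flow:
    """Hypothetical plain-English test flow (NOT Finalrun's real schema)."""
    name: str
    steps: List[str]
    # Source files the flow was generated from, mapped to their content hash
    # at generation time (an assumed convention for this sketch).
    source_hashes: Dict[str, str] = field(default_factory=dict)


def content_hash(text: str) -> str:
    """SHA-256 hex digest of a source file's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def stale_sources(flow: Flow, current_sources: Dict[str, str]) -> List[str]:
    """Return source paths that changed or disappeared since generation."""
    stale = []
    for path, old_hash in flow.source_hashes.items():
        text = current_sources.get(path)
        if text is None or content_hash(text) != old_hash:
            stale.append(path)
    return sorted(stale)


# Usage: the flow records the hash of the screen it was generated from.
login_v1 = "Button('Log in')"
flow = Flow(
    name="login",
    steps=[
        "Open the app",
        "Tap the Log in button",
        "Verify the home screen is visible",
    ],
    source_hashes={"ui/login.kt": content_hash(login_v1)},
)

print(stale_sources(flow, {"ui/login.kt": login_v1}))             # []
print(stale_sources(flow, {"ui/login.kt": "Button('Sign in')"}))  # ['ui/login.kt']
```

A check like this could run in CI and trigger regeneration only for flows whose sources changed, which is one possible way to limit token usage compared with regenerating every test on each run.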

Repo: https://github.com/final-run/finalrun-agent

Demo: https://youtu.be/rJCw3p0PHr4

The demo video shows the "post-development hand-off": an AI builds a feature in an IDE, and Finalrun immediately generates and runs a vision-based test that verifies the AI-built feature.


Comments

gavinray today at 6:21 PM

  > The shift for me was realizing test generation shouldn’t be a one-off step. Tests need to live alongside the codebase so they stay in sync and have more context.

Does the actual test code generated by the agent get persisted to the project?

If not, you have kicked the proverbial can down the road.

avikaa today at 5:45 PM

This solves a massive headache. The drift between externally generated tests and an active codebase is brutal to manage.

Using vision-based execution instead of brittle XPaths is a great baseline, but moving the test definitions to live directly alongside the repo context is definitely the real win here.

Did you find that generating the YAML from the codebase context entirely eliminated the "stale test" issue, or do developers still need to manually tweak the generated YAML when mobile UI layouts change drastically? Great project!

sahilahuja today at 5:01 PM

Agentic testing. Kudos to your decision to open-source it!

arnold_laishram today at 2:57 PM

Looks pretty cool. How does your agent understand plain English?

rootally7 today at 5:59 PM

[dead]

arbaaz today at 3:19 PM

[dead]