btbuildem • today at 3:23 AM • 4 replies

If I'm reading this correctly, it's limited to browser use, not general computer use (e.g., you won't be able to orchestrate KiCad workflows with it). Not disparaging, just noticing the limitation.

I've been playing with the Qwen3-VL-30B model using Playwright to automate some common things I do in browsers, and the LLM does "reasonably well", in that it accelerates finding the right ways to wrangle a page with Playwright, but then you want to capture that in code anyway for repeated use.

I wonder how this compares -- supposedly purpose-made for the task, but also significantly smaller.

Replies

MiguelG719 • today at 5:48 AM

> but then you want to capture that in code anyway for repeated use.

Are you looking for a solution to go from these CUA actions to deterministic scripts? Check out https://docs.stagehand.dev/v3/best-practices/caching
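For anyone who wants to see the general idea without any particular tool, here is a rough sketch of turning a recorded action log into a replayable Playwright script. To be clear, this is not how Stagehand's caching works; the action format (`op`, `selector`, `value`, `url`) and the `to_playwright_script` helper are made up for illustration, and only the emitted `page.goto` / `page.fill` / `page.click` calls are real Playwright (Python sync API) methods.

```python
# Hypothetical action log: an assumed shape for what an agent run might record.
ACTIONS = [
    {"op": "goto", "url": "https://example.com/login"},
    {"op": "fill", "selector": "#user", "value": "alice"},
    {"op": "click", "selector": "button[type=submit]"},
]


def to_playwright_script(actions):
    """Render recorded actions as source text for a standalone Playwright script."""
    lines = [
        "from playwright.sync_api import sync_playwright",
        "",
        "with sync_playwright() as p:",
        "    browser = p.chromium.launch()",
        "    page = browser.new_page()",
    ]
    for action in actions:
        op = action["op"]
        if op == "goto":
            lines.append(f"    page.goto({action['url']!r})")
        elif op == "fill":
            lines.append(f"    page.fill({action['selector']!r}, {action['value']!r})")
        elif op == "click":
            lines.append(f"    page.click({action['selector']!r})")
        else:
            # Fail loudly instead of silently dropping a step from the replay.
            raise ValueError(f"unsupported op: {op}")
    lines.append("    browser.close()")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the generated script; save it to a file to run it with Playwright installed.
    print(to_playwright_script(ACTIONS))
```

The generator only builds text, so it runs without Playwright installed; the script it prints does need Playwright and a browser. Selectors recorded this way are as brittle as the page they came from, so the LLM is still useful for re-deriving them when a replay breaks.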

jillesvangurp • today at 9:39 AM

Well, you could emulate things and run them in a browser via WASM. I think it's more of a security limitation than a model limitation. In the browser they get to lean on the sandboxing model.

aargh_aargh • today at 12:41 PM

This is in my area of interest. Can you recommend any related tools/resources? Did you publish any code?

brianjking • today at 3:41 AM

Correct, this only works in the browser w/ Playwright as far as I can tell from a quick test.
