Hacker News

arjunchint, yesterday at 10:40 PM

Pretty doubtful about computer-use/screenshot-based approaches.

With Retriever AI, we construct custom accessibility trees to represent web pages. We just switched over to DeepSeek v4 Flash, and it's nearing a 100x cost decrease.
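
To make the idea concrete, here is a minimal sketch of turning a page into a compact, accessibility-tree-style outline that a text-only model can read instead of a screenshot. It is not Retriever AI's actual implementation: the `ROLE_BY_TAG` mapping, the `A11yOutline` class, and the `outline` helper are hypothetical names chosen for illustration, and a real accessibility tree covers far more roles, states, and nesting.

```python
from html.parser import HTMLParser

# Hypothetical, deliberately tiny mapping from HTML tags to a11y-style roles.
ROLE_BY_TAG = {
    "a": "link", "button": "button", "input": "textbox",
    "textarea": "textbox", "select": "combobox",
    "h1": "heading", "h2": "heading", "h3": "heading",
}
VOID_TAGS = {"input"}  # elements that never get a closing tag


class A11yOutline(HTMLParser):
    """Collects semantic/interactive elements as a flat list of nodes."""

    def __init__(self):
        super().__init__()
        self.nodes = []   # every node seen so far, in document order
        self._open = []   # nodes whose closing tag has not been seen yet

    def handle_starttag(self, tag, attrs):
        role = ROLE_BY_TAG.get(tag)
        if role is None:
            return  # skip layout-only elements such as div/span
        attrs = dict(attrs)
        # Prefer an explicit label; fall back to visible text later.
        name = attrs.get("aria-label") or attrs.get("placeholder") or ""
        node = {"id": len(self.nodes) + 1, "role": role,
                "name": name, "text": [], "tag": tag}
        self.nodes.append(node)
        if tag not in VOID_TAGS:
            self._open.append(node)

    def handle_data(self, data):
        text = data.strip()
        if text:
            for node in self._open:
                node["text"].append(text)

    def handle_endtag(self, tag):
        if self._open and self._open[-1]["tag"] == tag:
            self._open.pop()


def outline(html: str) -> str:
    """Render one line per node: [id] role "accessible name"."""
    parser = A11yOutline()
    parser.feed(html)
    parser.close()
    lines = []
    for n in parser.nodes:
        name = n["name"] or " ".join(n["text"])
        lines.append(f'[{n["id"]}] {n["role"]} "{name}"')
    return "\n".join(lines)


page = ('<div><h1>Sign in</h1><input placeholder="Email">'
        '<button>Continue</button>'
        '<a href="/help" aria-label="Help">?</a></div>')
print(outline(page))
```

The numbered ids give the model something short to refer to (for example "click [3]"), which is part of why a text outline tends to cost far fewer tokens than an image of the same page.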

We've also had great success just reverse engineering the underlying APIs of websites and then writing code to hit them directly. Using screenshots to take actions on a webpage, only to trigger the network calls the website makes anyway, seems too naive.
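
As a rough sketch of what "writing code to hit them" can look like, assume the browser's network tab showed the site calling a JSON search endpoint. The URL, the `q` and `page` parameters, and both function names below are made up for illustration; a real site will have its own routes, auth, and rate limits.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical endpoint, as if spotted in the browser's network tab.
API_URL = "https://example.com/api/search"


def build_search_request(query: str, page: int = 1) -> Request:
    """Build the same GET request the page's own JavaScript would send."""
    params = urlencode({"q": query, "page": page})
    return Request(f"{API_URL}?{params}",
                   headers={"Accept": "application/json"})


def search(query: str, page: int = 1) -> dict:
    """Send the request and decode the JSON body (performs network I/O)."""
    with urlopen(build_search_request(query, page), timeout=10) as resp:
        return json.load(resp)


# Inspect the request without sending it.
req = build_search_request("blue shoes", page=2)
print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes it easy to check what would go over the wire before pointing the code at a live site.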


Replies

infecto, yesterday at 10:50 PM

Reverse engineering APIs is just a recipe to get blocked sooner. Good luck!