Hacker News

jofzar today at 8:56 AM

I'm going to say the opposite of what everyone else is saying.

This is sick, OP. Based on what's in the document, it looks really useful when you need to quickly fix something and then validate that nothing has changed in the UI/workflow except what you asked for.

It also looks useful for PRs: you get a before and after of what changed.


Replies

jillesvangurp today at 9:24 AM

Exactly. We need more tools like this. With the right model, picking apart images and videos isn't that hard. Adding vision to your testing removes a lot of guesswork from AI coding when it comes to fixing layout bugs.

A few days ago I had an interaction with codex that went roughly as follows: "this chat window is scrolling off screen, fix", "I've fixed it", "No you didn't", "You are totally right, I'm fixing it now", "still broken", "please use a headless browser to look at the thing and then fix it", "....", "I see the problem now, I'm implementing a fix and verifying the fix with the browser", etc. This took a few tries and it eventually nailed it. And added the e2e test, of course.

I usually prompt codex with screenshots for layout issues as well. One of the nice things about their desktop app relative to the CLI is that pasting screenshots works.

A lot of our QA practices are still rooted in us checking stuff manually. We need to get ourselves out of the loop as much as possible. Tools like this make that easier.

I think I recall Mozilla pioneering regression testing of their layout engine using screenshots about a quarter century ago. They had a lot of stuff landing in their browser that could trigger all sorts of weird regressions. If a screenshot changed without good reason, that was a bug. A very simple mechanism, and very effective. We can do better these days.
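
For anyone who hasn't seen the mechanism, here is a rough sketch of the "screenshot changed without good reason" check. This is not Mozilla's implementation and not the tool from the post; the function names, the nested-list image representation, and both thresholds are assumptions made up for illustration. A real setup would decode actual PNG screenshots with an image library instead.

```python
from typing import List, Tuple

# Hypothetical in-memory image: rows of (R, G, B) tuples.
Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def changed_fraction(baseline: Image, current: Image, tolerance: int = 0) -> float:
    """Return the fraction of pixels where any channel differs by more than `tolerance`."""
    # Treat any change in dimensions as a full change.
    if len(baseline) != len(current) or any(
        len(a) != len(b) for a, b in zip(baseline, current)
    ):
        return 1.0
    total = sum(len(row) for row in baseline)
    if total == 0:
        return 0.0
    changed = 0
    for row_a, row_b in zip(baseline, current):
        for pixel_a, pixel_b in zip(row_a, row_b):
            if any(abs(ca - cb) > tolerance for ca, cb in zip(pixel_a, pixel_b)):
                changed += 1
    return changed / total


def is_regression(
    baseline: Image, current: Image, max_changed: float = 0.0, tolerance: int = 0
) -> bool:
    """Flag a regression when more than `max_changed` of the pixels differ."""
    return changed_fraction(baseline, current, tolerance) > max_changed


if __name__ == "__main__":
    white: Pixel = (255, 255, 255)
    before: Image = [[white] * 4 for _ in range(4)]
    after: Image = [row[:] for row in before]
    after[0][0] = (0, 0, 0)  # one pixel of sixteen changed
    print(changed_fraction(before, after))
    print(is_regression(before, after))
```

The strict default (`max_changed=0.0`) matches the "any unexplained change is a bug" idea. In practice you'd probably loosen `tolerance` a bit to absorb anti-aliasing noise, and update the baseline whenever a change is intentional.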