Hacker News

theturtletalks, today at 12:09 AM

I’d also check out Midscene. You can choose which model it uses: UI-TARS works, and the Qwen vision models work too.