visioninmyblood today at 4:24 AM

You can get the exact coordinates by running a keypoint network to pinpoint where the next click should go. Here is a simple example prompt that returns the keypoint location of the next button to click and visually localizes it with a keypoint on the image:

https://chat.vlm.run/c/e12f0153-7121-4599-9eb9-cd8c60bbbd69
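
Once you have a keypoint back, turning it into a click target is a small step. Here is a rough sketch, assuming the model returns the point as normalized (x, y) values in the range 0 to 1 relative to the screenshot. That output format is an assumption on my part, not something confirmed by the linked example, and `keypoint_to_pixel` is a hypothetical helper name.

```python
from typing import Tuple


def keypoint_to_pixel(
    x_norm: float, y_norm: float, width: int, height: int
) -> Tuple[int, int]:
    """Map a normalized keypoint to integer pixel coordinates.

    Assumes (hypothetically) that the keypoint is given as fractions of the
    image width and height. The result is clamped so it always lands inside
    the image, even if the model returns a value slightly out of range.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    px = min(max(round(x_norm * width), 0), width - 1)
    py = min(max(round(y_norm * height), 0), height - 1)
    return px, py


# Example: a keypoint at the horizontal center, a quarter of the way down,
# on a 1920x1080 screenshot.
click_x, click_y = keypoint_to_pixel(0.5, 0.25, 1920, 1080)
```

If the model returns absolute pixel coordinates instead, the scaling step is unnecessary and only the clamping is worth keeping.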