I have a similar finding for a website I made that collates college town bar specials and live music. Using agents with vision models works but it's not as straightforward as one would initially think. U can check out the results here. https://www.nittanynights.com