Hacker News

bandrami, today at 2:40 AM

If you say the image models don't "see", you also have to say the text models don't "read": there's a meaningful case to be made for either claim, but then you're left saying "they behave as if they see" or "they behave as if they read".