>But the author just took pictures of food & expected a realistic response? Is this genuinely what amounts to a study in AI?
If there are commercial services where you take pictures of food and are promised a realistic (paid for) response, then yes. And there are.
But I don't see them using those commercial services in this study - instead, they're using models from frontier model companies? Is Gemini advertising that you get a realistic calorie count from a picture? Maybe so - in which case I'd take it back!
And what’s the variance & accuracy of their responses? Isn’t comparing the models’ variance to baseline human variance what matters here? It seems like they didn’t do that, and I agree with parent’s call for that kind of baseline.
Having counted calories for years, I don’t think I could reliably estimate the calories or carbs in the example picture of a cheese sandwich. I can make assumptions about the bread and the cheese, but I might easily be off by 2-3x. Calorie counting apps that use text descriptions also have huge variance for the same thing. The problem might be the belief that a picture or description is enough, regardless of who or what is guessing…?
Edit: Ah, I see from the sibling thread that you meant the commercial services are LLMs; I thought you meant there were human-backed services to compare to. Anyway, I totally agree there’s a problem if people rely on AI for safety, but I’m not sure LLMs are the core issue here. It seems like the core issue is guessing from vague information.