Hacker News

endymion-light today at 12:59 PM | 33 replies

There's an incredibly serious lack of education with how LLMs & carb-counting works. This entire article would be better suited to astrology.com than hackernews.

When I opened it up, I assumed the author would have at least attempted a calculation service, maybe even placed something like the size of the meal into an actual model, using the integration of pre-existing tools that are (slightly more) accurate. Hell - most food literally is required to have calorie information, and you can query open source data for others!

But the author just took pictures of food & expected a realistic response? Is this genuinely what amounts to a study in AI?

This is akin to the Instagram reels where people talk to ChatGPT and ask it to time how long their run is. Except those are treated as funny jokes rather than being turned into studies.

I'd like to see this study done using any kind of actual grounding knowledge, seeing what mistakes AI makes when attempting to query ground truth from picture analysis - there would at least be an interesting result methodology in that.


Replies

kalleboo today at 1:32 PM

> But the author just took pictures of food & expected a realistic response?

There are very popular apps on the App Store right now, going viral among non-techie people, that do exactly this, and those people have no concept of how AI works. My wife was talking about one and I had to give her a reality check that the AI had no idea what ingredients were used to make the food. And she's a licensed nutritionist.

Studies like this create something to point at for people who are confused and serve as a springboard for a conversation in the media.

andrewvc today at 2:08 PM

One of the biggest gaps is that people don't understand that food labels are allowed by the FDA to be off by up to 20% in terms of the number of actual calories!

In the real world you need to calibrate your behavior with the results. Are you gaining weight? You'll need to eat less if you want to lose any. You can do all the math with nutrition labels and macros you want but that's all theoretical.

See this study below for the 20% figure, as well as their experimental results on real food items (some even exceeded this threshold though most were within it). https://pmc.ncbi.nlm.nih.gov/articles/PMC3605747/?st_source=...

furyofantares today at 1:10 PM

From the text of the article I believe the author is implying there are apps doing exactly this, and so this is why it was studied that way.

Had the author written the article themselves rather than an LLM their motivation probably would have been clearer.

throwaw12 today at 1:12 PM

I feel like you didn't understand the goal of this study.

> The DTN-UK stated earlier this year that generic LLMs must never be used as autonomous advisory calculators for insulin delivery. This data is the quantitative evidence base for that statement.

This study exists to prove that you should not rely on LLMs for this.

Aurornis today at 1:25 PM

> But the author just took pictures of food & expected a realistic response? Is this genuinely what amounts to a study in AI?

The article explains this: There are apps targeting people with diabetes that claim to count your carbs with AI.

> If you’re using AI carb counting in a diabetes app

Before you dismiss a study, try to understand where it’s coming from.

The authors of the study weren’t stupid. They knew the LLMs would provide poor results. They ran the study to quantify it and create a resource to spread the information in response to the rise of AI carb counting apps.

coldtea today at 1:35 PM

>But the author just took pictures of food & expected a realistic response? Is this genuinely what amounts to a study in AI?

If there are commercial services where you take pictures of food and are promised a realistic (paid for) response, then yes. And there are.

swalsh today at 1:08 PM

It amazes me how much people try to build AI systems relying on nothing more than the model's knowledge. I suspect a great many of the "failed" AI experiments we keep reading about come from people not having any idea how to use AI for what it's good at.

datsci_est_2015 today at 2:16 PM

The obvious meme to invoke here is:

  - AI will solve all of our problems 
  - No not like that!
Are the trillion dollars sloshing around the AI economy well-invested if the refrain is always “you’re holding it wrong”?

So we’re trying to define, through trial and error, what problems “AI” will actually solve, and this paper is one of the many cobblestones on that road.

nextlevelwizard today at 1:02 PM

As someone who used to do this: OpenAI models refuse to look up calories unless you explicitly tell them to, and even then it's hit or miss, even if you tell them exactly what the product is. The easiest way to get a good calculation is to just take a photo of the nutrition label or feed that info in by hand.

Funny thing is, 4o did look up calories, but I guess it was too good for this world.

jvanderbot today at 2:01 PM

I did this too! For months (almost a year) I used descriptions, pictures, and measurements of food to get rough calorie counts. My diet is pretty simple and repetitive.

I would occasionally check the estimates, maybe once every few days for meals I wasn't already pretty sure of, and it was generally accurate. Where it was extremely inaccurate was portions, and as anyone who has dealt with computer vision could tell you, you can't get scale from a picture. So I'd have to weigh some meals or ingredients, which would generally make things accurate again.

So, I think it's possible, but you need multimodal data, grounded with regular checks.

giancarlostoro today at 1:20 PM

> But the author just took pictures of food & expected a realistic response? Is this genuinely what amounts to a study in AI?

Reminds me of that one YouTube video (I forget whose, so I have no idea how to pull it up) where he turns on his phone's camera for ChatGPT and asks what everything it sees weighs, then puts each item on a scale. ChatGPT was never right, ever, which makes sense; I couldn't tell you what most things weigh on sight alone either, but ChatGPT was often dramatically off. I got the feeling he thought it was terrible AI for this, but I don't think a model looking at an image of something and trying to guess its weight / calories / etc. is a reason to call an AI model bad...

whstl today at 3:34 PM

> But the author just took pictures of food & expected a realistic response?

You say this, but I know of quite a few companies in this area, including a couple accelerated by YCombinator, and that's pretty much 100% of what they do in their backend.

mathgradthrow today at 3:46 PM

a realistic response? What's a realistic response to "how many calories are in an avocado?"

If you are counting calories, you don't want the answer to "how many calories are in the average avocado?", you want to know how many calories are in this avocado. Remember that bodyweight is roughly linear with BMR, so a 10% error in calorie counting is an extra 10% of bodyweight.
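The argument above can be sketched as a toy energy-balance simulation. All constants here are assumptions for illustration (maintenance calories taken as roughly linear in bodyweight, ~7700 kcal per kg of tissue), not values from the comment or the study:

```python
KCAL_PER_KG_TISSUE = 7700.0  # assumed energy content of 1 kg of body tissue
KCAL_PER_KG_MAINT = 30.0     # assumed maintenance need per kg of bodyweight

def simulate_weight(weight_kg, intake_kcal, days):
    """Iterate daily energy balance: any surplus over maintenance becomes tissue."""
    for _ in range(days):
        surplus = intake_kcal - KCAL_PER_KG_MAINT * weight_kg
        weight_kg += surplus / KCAL_PER_KG_TISSUE
    return weight_kg

start = 70.0
maintenance = KCAL_PER_KG_MAINT * start  # 2100 kcal/day at 70 kg
# A consistent 10% undercount means eating 10% over true maintenance;
# under the linearity assumption, weight drifts toward a new equilibrium
# roughly 10% higher (~77 kg).
final = simulate_weight(start, maintenance * 1.10, days=5000)
```

The point of the sketch: a systematic counting error doesn't cause unbounded gain, but it does shift the equilibrium weight by roughly the same percentage.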

something765478 today at 1:37 PM

> The prompt I used asks each model to return a confidence score (0 to 1) for every food item it identifies. All four models dutifully returned confidence scores for 100% of items. Surely we can use those to filter out bad estimates?

This is a problem with the companies selling the AI models, not the customers. It is their responsibility to inform consumers about the limits of their services, and to train the models to say "I don't know, there is not enough information".

toasty228 today at 2:55 PM

> But the author just took pictures of food & expected a realistic response?

There are dozens of iOS/Android apps with 100-300k+ ratings and god knows how many millions of installs that do exactly this.

"Cal AI - Food Calorie Tracker: Just snap a photo and our smart AI calorie tracker analyzes your meal instantly."

308k ratings on iOS, 264k on Android; easily 5-10m installs across both platforms.

layer8 today at 1:54 PM

The author is doing what a non-sophisticated user would do, or would want to be able to do, and estimating calories from a photo has been an often-cited potential or promised AI use case in recent years. It makes a lot of sense to test current general-purpose AI's performance on it as a reality check.

It also exemplifies how current AI offerings are still quite limited in their capabilities: one would expect them to do the intelligent thing you had in mind on their own, instead of the user having to come up with a working methodology.

InsideOutSanta today at 1:20 PM

There are apps in the app store right now that pretend to do this kind of thing, so having somebody actually show that it doesn't work is valuable, even if we already knew the outcome ahead of time.

zipy124 today at 1:18 PM

Honestly it's scary how misunderstood this is by the general public, the media and EVEN scientists.

There is a shocking number of computer vision tasks where scientists claim you can get X info from a picture of Y, when even with ML/AI you can't extract data where there isn't any. The fact that I can add an arbitrary amount of high-calorie fat to a meal without changing its appearance shows, by definition, that it's pointless. A 1000-calorie and a 100-calorie milkshake can look identical, and you'd have no way of working that out from an image even with a super-intelligent system.

Similarly, I see serious research papers doing things like extracting the material of an object from an image of it, which for the same reason cannot be done: how an object looks has very little to do with what it's made of, else painting and other art would clearly be impossible. The information is just not there in the data.

larodi today at 1:40 PM

> This entire article would be better suited to astrology.com than hackernews.

I laughed, but you nailed it. Sadly, so many people lack even a basic understanding of LLMs and the ViT tower that makes an LLM a VLM, that I expect a whole industry, similar to fortune telling, to emerge out of it.

winddude today at 2:14 PM

As a T1 diabetic, this is exactly what we do for nearly every meal we eat, especially in restaurants: look at it and try to estimate the number of carbs.

macleginn today at 1:37 PM

It doesn't really matter whether the model can make a good educated guess about the calories in the food if it cannot give a consistent response to the same input.

chromacity today at 2:31 PM

> There's an incredibly serious lack of education with how LLMs & carb-counting works

Oh! Do the vendors offer training to make sure the users understand how LLMs work? If not, surely the LLM itself is trained to know its limitations and politely decline in situations like that?...

The #1 use case for this tech is "here's a problem I don't feel like solving, let's have a computer do magic". It's how it's advertised on TV, it's how it's promoted in the software I already use. Food preparation? Travel planning? Shopping? Tutoring your children? You can do anything now!

I just talked to a realtor who will make a killing on a real estate transaction. Instead of offering human insights, they sent me "AI reviews" of several properties. The AI has never been to any of these properties and has no idea what they actually look like. But I guess it's how we operate now as a society.

If you go to eBay, every other listing description for used items is AI-generated. This is an official platform feature for sellers. The AI doesn't know the condition of the item or what's included or missing. Doesn't matter, it's magic. It's AGI, it will figure it out.

Most of the uses of AI I encounter as a consumer are like that, and the companies selling this tech are 100% complicit.

ilivethere today at 1:26 PM

> But the author just took pictures of food & expected a realistic response?

Outside our tech-enabled bubble, there are folks who have been sold the idea that ChatGPT et al. are miracle workers capable of replacing dieticians, gym coaches, psychologists, etc.

So it's VERY plausible to believe that there are folks out there snapping pics of their meals and asking GPT to spit out nutritional values.

jrm4 today at 3:43 PM

This strikes me as a good "meta" article, though. As in, yes, people here probably don't need this. But perhaps a lot of other people do.

slumpt_ today at 3:40 PM

This is how a lot of regular people are engaging with AI, whether you consider it silly or not.

sleepybrett today at 3:34 PM

> There's an incredibly serious lack of education with how LLMs & carb-counting works. This entire article would be better suited to astrology.com than hackernews.

This is because the people who promote these technologies, and the companies that sell these technologies, engage in a massive amount of puffery (aka hyperbolizing aka just straight telling lies).

These technologies are painted as the magical solution to whatever problem you have (all it costs you is a few tens of thousands of tokens, aka your water supply). There is literally nothing they CAN'T do, if you will just let them build these gigantic small-town-destroying, noise-polluting, water- and electricity-hungry "AI datacenters". So that they can use those datacenters to sell you more tokens to put into their slot machines.

YeGoblynQueenne today at 2:44 PM

>> But the author just took pictures of food & expected a realistic response? Is this genuinely what amounts to a study in AI?

The aim of the study was to understand the variation in results returned by models and how that could cause risks for patients using those models. The main result was measuring within-model variation.

From the pre-print (https://www.diabettech.com/wp-content/uploads/2026/04/diabet...):

We aimed to characterise the within-image reproducibility of carbohydrate estimates from four large language model (LLM) vision APIs and to quantify the clinical risk for insulin dosing, stratifying accuracy by reference value quality.

Methods

Thirteen food photographs were each submitted 495–561 times to four LLM vision APIs (GPT-5.4, Claude Sonnet 4.6, Gemini 2.5 Pro, Gemini 3.1 Pro Preview) using an identical structured prompt adapted from the iAPS automated insulin delivery system (26,904 total queries, temperature 0.01). The primary outcome was within-image variation (coefficient of variation [CV], range, distributional normality). Secondary outcomes included accuracy against reference values for nine images, stratified by quality tier (packet label, weighed/measured, portioned, or visual estimate). Clinical risk was translated at an insulin-to-carbohydrate ratio of 1:10.

>> I'd like to see this study done using any kind of actual grounding knowledge, seeing what mistakes AI makes when attempting to query ground truth from picture analysis - there would at least be an interesting result methodology in that

The ground truth was established by the author. There's an appendix in the pre-print (Appendix I) that describes the methodology. Methods are described in page 4 of the pre-print:

Reference values for accuracy analysis

For nine of the thirteen images, the author estimated the carbohydrate content using methods described in Appendix 1. Reference quality was categorised into four tiers:

Tier 1 (packet label): Carbohydrate values derived from manufacturer nutrition labelling. Two images (cheese sandwich, soup with bread) used bread with labelled carbohydrate content of 20 g per slice.

Tier 2 (weighed/measured): Portions directly weighed and cross-referenced with established composition data. Three images (Bakewell tart, bakery cookie, breakfast burrito).

Tier 3 (portioned): Portions estimated by the author (not weighed) and combined with USDA composition data. Three images (roast dinner, chilli con carne with rice, stuffed pork loin).

Tier 4 (visual estimate): Portions and composition estimated from visual inspection. One image (churros).

For the four restaurant dishes (pizza capricciosa, eggs benedict, crema catalana, paella), no reference value was established. These images were used for the primary reproducibility analysis only.

Carbohydrate values follow the EU convention with dietary fibre excluded.
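The two headline quantities in the quoted methods, within-image CV and the dose translation at a 1:10 insulin-to-carbohydrate ratio, are simple to compute. A minimal sketch with made-up repeated estimates (not the paper's data):

```python
import statistics

def within_image_cv(estimates):
    """Coefficient of variation (%) across repeated carb estimates of one image."""
    return 100.0 * statistics.stdev(estimates) / statistics.mean(estimates)

def dose_spread_units(estimates, icr_g_per_unit=10.0):
    """Spread of implied insulin doses at an insulin-to-carb ratio of 1:icr."""
    return (max(estimates) - min(estimates)) / icr_g_per_unit

carbs_g = [45, 60, 38, 72, 50, 55]  # hypothetical repeated answers, in grams
cv = within_image_cv(carbs_g)       # roughly 22% spread around the mean
units = dose_spread_units(carbs_g)  # (72 - 38) / 10 = 3.4 units of insulin
```

The translation step is what turns a statistical curiosity into clinical risk: the same photo yielding answers 34 g apart means dose recommendations 3.4 units apart at that ratio.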

black6 today at 2:07 PM

> There's an incredibly serious lack of education with how LLMs & carb-counting works

The public's education comes from the incessant marketing from AI companies that their models are the panacea for everything.

jmyeet today at 1:50 PM

> But the author just took pictures of food & expected a realistic response?

If someone sent me a picture of a meal and asked me what the macros were or how many carbs this is, I would say "I can't tell from a photo. Nobody can". The problem is that current LLM chatbots don't seem to have a concept of telling you "I don't know", "you can't do that" or even "you're wrong".

You can say that somebody shouldn't trust an LLM for this, but it's going to be a problem that LLMs give nonsensical answers. What I find particularly amusing is that there are still technical people (generally, not anyone specifically) who seem unable to acknowledge that LLMs hallucinate and lie.

There was a post on here recently that I couldn't find with some quick searching, but the premise was basically that chatbots were trained like neurotypical people: a lot of affirmation and basically lying. Separately, someone else characterized this NT style of communication as "tone poems" [1]. I keep thinking about that because to me that's so accurate.

Dunning-Kruger is a common refrain on HN, for good reason. Another way to put this is how often people are confidently wrong. I really wonder if this is an inevitable consequence of NT communication because most neurodivergent ("ND") people I know are incredibly intentional in what they say and mean.

[1]: https://news.ycombinator.com/item?id=47832952


SirMaster today at 2:41 PM

How about, instead of blaming the user for not understanding how AI works, the AI makers stop letting their chatbots so confidently answer questions that they clearly can't answer...

If I ask the AI about some health issue, it says something along the lines of a warning that it's not a doctor, etc. So if I show it a picture and ask it to tell me the carbs, how about a warning telling me it can try, but that it probably won't be very accurate.