They investigated an open source application that specifically advertises carb counting capabilities, and replicated its prompts and API calls in a way optimised to collect data from 26,000 queries (which is a lot to do using a GUI!). They also note that other people have already done [necessarily] smaller scale studies of the commercial AI carb counting apps and been similarly unimpressed by the responses.
This is all in the first few paragraphs of a preprint paper, linked at the bottom of TFA, which describes the research in considerably more detail.
Meta: enjoying nearly half this HN thread being arguments that surely people who care about what's in their food don't ask ChatGPT for comment instead of looking it up properly, and most of the rest of it being people who apparently care what's in a research paper asking HN for comment instead of looking it up :)