Theoretically:
- A queryable vector database of calorie counts for specific meals.
- A vision model trained on food images labelled with approximate calorie counts.
- An OCR model for reading barcodes and printed calorie information.
- A model trained to ask for additional context and information (e.g. for pasta: "please provide a photo of the original sauce tin, etc.", "please approximate the weight of the meat").
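To make the integration concrete, here's a rough sketch of how those components might be wired together. Everything here is hypothetical: the function names, the stub OCR/vision/database logic, and the confidence threshold are all placeholders for real models and a real calorie database.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Estimate:
    kcal: float
    confidence: float                  # 0.0 - 1.0
    needs_more_context: bool = False
    follow_up: Optional[str] = None    # question to ask the user

def read_barcode(image: bytes) -> Optional[str]:
    """Stub OCR: pretend the barcode is embedded in the image bytes."""
    return image.decode() if image.startswith(b"barcode:") else None

def lookup_barcode(code: str) -> Estimate:
    """Stub packaged-food lookup keyed by barcode (hypothetical entry)."""
    db = {"barcode:0000000000000": Estimate(kcal=139.0, confidence=0.95)}
    return db.get(code, Estimate(kcal=0.0, confidence=0.0))

def classify_food(image: bytes) -> tuple[str, float]:
    """Stub vision model: returns (label, confidence)."""
    return ("pasta with tomato sauce", 0.55)

def estimate_from_vector_db(label: str) -> float:
    """Stub stand-in for a nearest-neighbour query against the
    calorie vector database."""
    return {"pasta with tomato sauce": 620.0}.get(label, 0.0)

def estimate_calories(image: bytes, threshold: float = 0.8) -> Estimate:
    """Try the barcode path first; otherwise fall back to the vision
    model plus vector-DB lookup, and ask the user for extra context
    when confidence falls below the threshold."""
    code = read_barcode(image)
    if code is not None:
        return lookup_barcode(code)
    label, conf = classify_food(image)
    est = Estimate(kcal=estimate_from_vector_db(label), confidence=conf)
    if conf < threshold:
        est.needs_more_context = True
        est.follow_up = (
            f"Low confidence for '{label}': please photograph the "
            "packaging/sauce tin and approximate the weight."
        )
    return est
```

The design choice worth noting is the confidence gate: rather than always pestering the user, the app only asks follow-up questions when the automated estimate is uncertain, which is where most of the end-user annoyance would come from.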
I don't know how accurate integrating all of those components would be, and you could argue the end user would be too annoyed by the follow-up questions for it to be a good app, but I'd argue you'd need at least that much if you're developing an app for diabetes management.