Not to be a luddite, but large language models are fundamentally not meant for tasks of this nature. And listen to this:
> Most notably, it provides confidence levels in its findings, which Cheeseman emphasizes is crucial.
These 'confidence levels' are suspect. You can ask Claude today, "What is your confidence in __" and it will, unsurprisingly, give a 'confidence interval'. I'd like to better understand the system implemented by Cheeseman. Otherwise I find the whole thing, heh, cheesy!
> large language models are fundamentally not meant for tasks of this nature
There should be some research results showing their fundamental limitations, as opposed to empirical observations. Can you point at them?
What about VLMs, VLAs, LMMs?
Can't LLMs be fed the entire corpus of literature to synthesise (if not "insight") useful intersections? Not to mention much better search than what was available when I was a lowly grad...
Finding patterns in large datasets is one of the things LLMs are really good at. Genetics is an area where scientists have already done impressive things with LLMs.
However you feel about LLMs (and I say this because you don't have to use them for very long before you see how useful they can be for large datasets, so I'm guessing you're not a fan), they are undeniably incredible tools in some areas of science.
https://news.stanford.edu/stories/2025/02/generative-ai-tool...
I made a toy order item cost extractor out of my pile of emails. Claude added confidence percentage tracking and it couldn't be more useless.
This is what Yann LeCun means when he talks about how research is at a dead end at the moment, with everyone all in on LLMs to a fault.
I've spent the last ~9 months building a system that, amongst other things, uses a vLLM to classify and describe >40 million images of house number signs in all of Italy. I wish I was joking, but that aside.
When asked about their confidence, these things are almost entirely useless. If the Magic Disruption Box is incapable of knowing whether or not it read "42/A" correctly, I'm not convinced it's gonna revolutionize science by doing autonomous research.
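For anyone who wants to put a number on "useless": if you have a hand-labelled sample, you can compare the confidence the model states with how often it is actually right. Below is a minimal sketch of expected calibration error (ECE), a standard calibration metric. It is not anything from the article or from my pipeline; the function name, the bin count, and the sample data are all made up for illustration.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Weighted average gap between stated confidence and observed accuracy."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("need at least one prediction")

    # Each bin holds [sum of confidences, number correct, count].
    bins = [[0.0, 0, 0] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        if not 0.0 <= conf <= 1.0:
            raise ValueError("confidence must be in [0, 1]")
        # A confidence of exactly 1.0 goes into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx][0] += conf
        bins[idx][1] += int(ok)
        bins[idx][2] += 1

    total = len(confidences)
    ece = 0.0
    for conf_sum, n_correct, count in bins:
        if count == 0:
            continue
        avg_conf = conf_sum / count
        accuracy = n_correct / count
        ece += (count / total) * abs(avg_conf - accuracy)
    return ece


# Hypothetical sample: the model claims 0.9 every time but is right half the time.
stated = [0.9, 0.9, 0.9, 0.9]
was_right = [True, False, True, False]
print(expected_calibration_error(stated, was_right))
```

A value near 0 means the stated confidence tracks reality; a large value means the percentages are decoration. It only tells you anything if the labelled sample is representative of the real data.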