logoalt Hacker News

simianwordstoday at 7:31 PM0 repliesview on HN

> To some extent. It's not clear where specifically the boundaries are, but it seems to fail to approach problems in ways that aren't embedded in the training set. I certainly would not put money on it solving an arbitrary logical problem.

In what way can you falsify this without having the LLM be omniscient? We have examples of it solving things that are not in the training set - it found vulnerabilities in 25 year old BSD code that was unspotted by humans. It was not a trivial one either.