GPT-4 is about 45 gigabytes. https://dumps.wikimedia.org/other/kiwix/zim/wikipedia/wikipe... , a recent dump of the English Wikipedia, is over twice that, and that's just English. On top of that, AIs are expected to know about other languages, science, who even knows how much Reddit, and so on.
There literally isn't room for them to know everything about everyone when they're asked about random people without consulting sources, and even when they do consult sources it's still easy for them to come in with extremely wrong priors. The world is very large.
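To make the back-of-envelope explicit (taking the 45 GB figure at face value for a moment, even though it's questioned below, and pegging the English Wikipedia dump at very roughly 95 GB since it's "over twice" the model size):

    # Rough capacity comparison using the numbers above (both approximate/contested)
    model_size_bytes  = 45e9    # claimed GPT-4 size; see the reply below questioning this
    enwiki_dump_bytes = 95e9    # Kiwix English Wikipedia ZIM, "over twice" the model size
    # Weight bytes available per byte of English Wikipedia text alone, before adding
    # other languages, science, Reddit, etc.
    print(model_size_bytes / enwiki_dump_bytes)   # ~0.47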
You have to be very careful with these "on the edge" sorts of queries; that's where hallucination will be maximized.
GPT-4 was rumored to be trained on 13 trillion tokens. https://www.kdnuggets.com/2023/07/gpt4-details-leaked.html
Not sure where you’re getting the 45 GB number.
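As a rough sanity check (using the ~1.8 trillion parameter count rumored in that same leak, which is itself unconfirmed), the weights alone would come out far bigger than 45 GB:

    # Rough weight-size estimate from the rumored (unconfirmed) ~1.8T parameter count
    params = 1.8e12
    bytes_per_param = 2                          # fp16/bf16 storage
    weights_tb = params * bytes_per_param / 1e12
    print(weights_tb)                            # ~3.6 TB, roughly 80x the 45 GB figure
    # Even aggressive 4-bit quantization would still be around 0.9 TB.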
Also, Google doesn’t use GPT-4 for summaries. They use a custom version of their Gemini model family.
The issue is that it is not on the edge. It is very much the core feature of those tools, at least as they are currently sold.