Not only that, but all the compute spent, and hardware bought, will be worthless in 5 years.
Just the training. Training off of the Internet! It's filled with extremists, made-up nuttery, biased BS, and dogma. A large portion of the Internet is stupids talking to stupids.
Just look at all the gibberish scientific papers!
If you want a hallucination-prone dataset, just train on the Internet.
Over the next few years, we'll see training on encyclopedias and other data sources from the pre-Internet era. And we'll see it done on increasingly cheap hardware.
This tiny branch of computer science is decades old, and hasn't even taken off yet. There's plenty of room for new players.
How exactly do you foresee "pre-Internet" data sources being the future of AI?
We already train on these encyclopedias; we've trained models on massive percentages of all published book content.
None of this will be helpful either: it will be outdated and won't have modern findings or understanding. Nor will it help me diagnose a DHCP issue on Windows Server 2019 or something similar.