I do think we will, at some point, face a knowledge crisis because nobody will be willing to upload the new knowledge to the internet.
Then the LLM companies will notice, and they’ll start to create their own updated private training data.
But that may be a new centralization of knowledge, which was already the case before the internet. I wonder if we are heading toward some sort of equilibrium between LLMs and the web, or toward some sort of centralization / decentralization cycle.
I also have some hope that LLMs will annihilate the commercial web of "generic" content and that may bring back the old web where the point was the human behind the content (be it a web page or a discussion). But that's what I’d like, not a forecast.
I kind of fear the same. At the same time I wonder if structured information will gain usefulness. Something like man pages are already a great resource for humans, but at the same time could be used for autocompletion and for LLMs. Maybe not in the current format but in the same vein.
But longer-form tutorials or even books with background might suffer more. I wonder how big the market for nice books on IT topics will be in the future. A wiki is probably in the worst place. It will not be updated along with the MR the way man pages could be, and you do not get the same reward as you do from publishing a book.
> nobody will be willing to upload the new knowledge to the internet
I think there will be differences based on how centralized the repository of knowledge is. Even if textbooks and wikis largely die out, I imagine individuals such as myself will continue to keep brief topic-specific "cookbook" style collections for purely personal benefit. There's no reason to be averse to publishing such things to GitHub or the like, and LLMs are fantastic at indexing and integrating disparate data sources.
Historically, sorting through 10k different personal diaries for relevant entries would have been prohibitive, but it seems to me that is no longer the case.
I wouldn't be surprised if LLM companies end up sponsoring certain platforms / news sites, in exchange for being able to use their content of course.
The problem with LLMs is that a single token (or even a single book) isn't really worth that much. It's not like human writing, where we'll pay far more for "Harry Potter" and "The Art of Computer Programming" than some romance trash with three reads on Kindle.