logoalt Hacker News

laroditoday at 11:03 AM3 repliesview on HN

after 25 years wikipedia showed what it truly was created for, by selling the content for training. otherwise - okay, this was a cool project, perhaps we need better. like federated, crypto-signed articles that once collected together, @atproto style, produce the article with notable changes to it.


Replies

RestartKerneltoday at 11:33 AM

Their enterprise offering is more for fresh retrieval than training. For training, you can just download the free database dump — one you would inadvertently end up recreating if you were to use their enterprise APIs in a (pre-)training pipeline.

armchairhackertoday at 11:39 AM

Context: https://arstechnica.com/ai/2026/01/wikipedia-will-share-cont...

tl;dr: Wikipedia is CC and has public APIs, but AI companies have recently started paying for "enterprise" high-speed access.

Notably, the enterprise program started in 2021 and Google has been paying since 2022.

giuliomagnificotoday at 11:07 AM

You’re saying Wikipedia was created 25 years ago to sell its content to train LLMs that didn’t even exist?! I doubt it…

show 1 reply