
lsy · yesterday at 8:53 PM

This seems simplistic; tech and infrastructure play a huge part here. A short and incomplete list of contributing factors:

- Moore's law petering out, steering hardware advances toward parallelism (multicore CPUs, then GPUs)

- Fast-enough internet creating a shift to processing and storage in large server farms, enabling both high-cost training and remote storage of large models

- Social media + search both enlisting consumers as data producers and necessitating armies of Mechanical Turk workers for content moderation + evaluation, a labor pool later available for tagging and RLHF

- A long-term shift to a text-oriented society, beginning with print capitalism, continuing through the rise of "knowledge work", and culminating in the migration of daily tasks (work, bill paying, shopping) online, a shift that allows a program producing only text to appear capable of doing many of the things a person does

We may have had the technical ideas in the 1990s, but we certainly didn't have the ripened infrastructure to put them into practice. Even if we'd had the dataset to create an LLM in the 90s, training would have been astronomically cost-prohibitive, in both compute and human labor, and the result would have had much less effect on society, because you couldn't have hooked it up to commerce or day-to-day activities (far fewer texts, emails, e-commerce transactions).
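
To put rough numbers on "astronomically cost-prohibitive", here's a back-of-envelope sketch. The figures are ballpark assumptions for illustration: ~3.14e23 FLOPs for GPT-3-scale training (the figure reported in the GPT-3 paper) against the ~1.3 teraflop peak of ASCI Red, the fastest supercomputer of 1997:

    # Back-of-envelope: GPT-3-scale training on 1997's fastest machine.
    train_flops = 3.14e23       # GPT-3's reported total training compute
    peak_flops_per_s = 1.3e12   # ASCI Red's ~1.3 TFLOP/s peak (1997)

    seconds = train_flops / peak_flops_per_s
    years = seconds / (3600 * 24 * 365)
    print(f"~{years:,.0f} years at sustained peak rate")  # about 7,700 years

Even granting perfect utilization, that's millennia of compute on the era's best hardware, before counting the human labor needed for data collection and labeling.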