So why was there crawling in 1998 but no LLMs?
I am unable to comprehend the state of mind that would lead one to ask this question.
We didn't have GPUs with hundreds of gigabytes of VRAM and tensor processing cores.
What kind of complete non sequitur is this?
Because the transformer, which all of these models are built on and which their makers (bar Google) didn't invent themselves, hadn't been invented yet? The amount of effort it took humanity to generate all the data the models needed to get to where they are now is absolutely not even comparable to how much effort it took to build the model code. Yeah, it's complicated, but if they hadn't ripped off all of humanity's combined output, it wouldn't even matter whether the transformer had been invented.