Even the smaller quantized models that can run on consumer hardware pack in an almost unfathomable amount of knowledge. Before the LLM boom, I don't think I expected to be able to run a 'local Google' in my lifetime.
I'm extremely curious how these models learn to pack a lossily compressed representation of the entire Internet (more or less) into a few hundred billion parameters. Like, what's the ontology?
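A rough back-of-envelope sketch of that compression claim, where every specific number (parameter count, quantization level, token count, bytes per token) is an assumption chosen purely for illustration, not a measurement:

```python
# Back-of-envelope: storage in a "few hundred billion parameters" versus a
# rough guess at the deduplicated text an LLM might see during training.
# All figures are illustrative assumptions.

params = 400e9            # assumed: "a few hundred billion" parameters
bits_per_param = 4        # assumed: a 4-bit quantized model
model_bits = params * bits_per_param

train_tokens = 15e12      # assumed: on the order of tens of trillions of tokens
bytes_per_token = 4       # assumed: average raw-text bytes per token
data_bits = train_tokens * bytes_per_token * 8

print(f"model size: {model_bits / 8 / 1e9:.0f} GB")          # ~200 GB
print(f"training text: {data_bits / 8 / 1e12:.0f} TB")       # ~60 TB
print(f"compression ratio: roughly {data_bits / model_bits:.0f}:1")  # ~300:1
```

Under these assumptions the model is a few hundred gigabytes holding some trace of tens of terabytes of text, i.e. hundreds of raw-text bytes per stored byte, which is why the representation has to be so aggressively lossy and abstractive.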