right, 888 kB would be impossible for local inference
however, it is really not that impressive for just a client
It's not completely impossible, depending on what your expectations are. That language model that was built out of redstone in Minecraft had... looks like 5 million parameters. And it could produce mostly coherent sentences.
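For a rough sense of scale, here's a back-of-the-envelope sketch (purely illustrative, not tied to any actual model or client): 5 million parameters only squeeze under a megabyte at very aggressive quantization, so 888 kB leaves essentially no room for anything else.

    # Rough back-of-the-envelope: storage needed for N parameters
    # at various quantization levels. Illustrative only; ignores
    # embedding/vocab tables, activations, and runtime overhead.

    def model_size_kb(num_params: int, bits_per_param: float) -> float:
        """Storage for the weights alone, in kB (1 kB = 1000 bytes)."""
        return num_params * bits_per_param / 8 / 1000

    for bits in (32, 16, 8, 4, 1.58):  # fp32, fp16, int8, 4-bit, ternary
        print(f"5M params @ {bits:>5} bits/param ~ {model_size_kb(5_000_000, bits):,.0f} kB")

    # Output:
    # 5M params @    32 bits/param ~ 20,000 kB
    # 5M params @    16 bits/param ~ 10,000 kB
    # 5M params @     8 bits/param ~ 5,000 kB
    # 5M params @     4 bits/param ~ 2,500 kB
    # 5M params @  1.58 bits/param ~ 988 kB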