What would that breakthrough be?
Magic math and computer science that allows us to get the same quality response for a fraction of the GPU.
I'd assume its a totally different architecture that isn't based on storing a compressed dataset of all digital human text.
Magic math and computer science that allows us to get the same quality response for a fraction of the GPU.