>I can just as well describe the future evolution of the internal combustion engine and claim it will get more and more efficient and eventually we will be able to burn oil so efficiently that our personal vehicles can fly through the atmosphere at twice the speed of sound.
This is a silly comparison. There is a fixed quantity of energy stored in oil, so we know what peak efficiency looks like. We don't actually know how much energy is required to solve certain problems. We already have fairly capable models that can run locally on a phone today, right alongside Stockfish, for example.
And this is to say nothing of work happening now on new hardware approaches, such as Normal Computing's work on thermodynamic matrix math: https://www.normalcomputing.com/blog/a-first-demonstration-o...
That said, this feels like a strange tangent: I'm not sure it's that important that the models be as energy efficient as a human brain. We don't avoid cars because they're less energy efficient than our legs. ;)
Point is that both are science fiction narratives and neither reflects reality in any way whatsoever. How fast a car can drive and how much an LLM can compute are bounded quantities, limited by physical reality. In both cases we can imagine a world where this limit does not exist, but that is not the reality we live in.
This matters because, unlike cars, LLMs are only doing stuff we can already do using our brains, just several orders of magnitude less efficiently. Cars can at least take us distances we would never be able to cover using our muscles. In comparison, if I need to compile CPython into a WASM binary I can simply download a library that does it, or copy-paste code in a few seconds, for a million billionth of the energy it takes an LLM to do the same. Except when I download the library or copy-paste the code I (hopefully) attribute the original author and give them credit for their work.