We had a workshop 6 months ago, and while I've always been sceptical of the silly AGI/ASI claims from OpenAI et al., the investments have shown the way to a lot of new technology and have let a genie out of the bottle that won't be put back.
Now, extrapolating from how Sun servers cost a fortune around the year 2000 and can be emulated by a $5 VPS today, Apple is seeing that they can maybe grab the local LLM workloads if they act now with their integrated chip development.
But to grab that, they need developers to rely less on CUDA via Python, or to have other proper hardware support for those environments, and that won't happen without the hardware being there first and the machines being able to be built with enough memory (refreshing to see Apple support 128 GB, even if it'll probably bleed you dry).
Torch MPS support on my local MacBook outperforms a CUDA T4 on Colab.
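For anyone who wants to try that kind of comparison themselves, here is a rough sketch, assuming the comparison is between PyTorch's MPS (Metal) backend and CUDA. The helper names (`pick_device_name`, `detect_device_name`, `time_matmul`) and the matrix size are made up for illustration, and a single matmul timing is only a crude proxy for real workloads.

```python
import time

try:
    import torch
except ImportError:  # PyTorch is optional for the device-picking logic
    torch = None


def pick_device_name(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA, then Apple's MPS backend, then fall back to CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device_name() -> str:
    """Ask the installed PyTorch (if any) which backend is usable."""
    if torch is None:
        return "cpu"
    mps_backend = getattr(torch.backends, "mps", None)  # missing on old builds
    mps_ok = mps_backend is not None and mps_backend.is_available()
    return pick_device_name(torch.cuda.is_available(), mps_ok)


def time_matmul(device_name: str, size: int = 1024, repeats: int = 10) -> float:
    """Average seconds per (size x size) matmul on the given device."""
    if torch is None:
        raise RuntimeError("PyTorch is not installed")

    def sync() -> None:
        # GPU work is asynchronous, so wait for it before reading the clock.
        if device_name == "cuda":
            torch.cuda.synchronize()
        elif device_name == "mps":
            torch.mps.synchronize()

    device = torch.device(device_name)
    a = torch.randn(size, size, device=device)
    b = torch.randn(size, size, device=device)
    _ = a @ b  # warm-up run, excluded from the timing
    sync()
    start = time.perf_counter()
    for _ in range(repeats):
        _ = a @ b
    sync()
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__" and torch is not None:
    name = detect_device_name()
    print(f"{name}: {time_matmul(name) * 1e3:.2f} ms per matmul")
```

Run the same script on the MacBook and in a Colab T4 runtime and compare the printed numbers. The explicit synchronize calls matter, since without them you mostly measure kernel launch time rather than the work itself.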
Except CUDA feels really cozy, because, like Microsoft, NVIDIA understands the "Developers, Developers, Developers" mantra.
People always overlook that CUDA is a polyglot ecosystem: there is the IDE and graphical debugging experience, where one can even single-step through GPU code, and there is the ecosystem of libraries.
And as of last year, NVIDIA has started to take Python seriously, and now, with the cuTile-based JIT, it is possible to write CUDA kernels in pure Python, rather than having Python generate C++ code that other tools then ingest.
They are getting ahead of Modular, with Python.
I feel like the push by devs towards Metal compatibility has been 10x what it has been for AMD. I assume that's because the majority of us run MacBooks.