With Thunderbolt 5 and M5 Ultras, Apple could be building lower-cost clusters that scale far enough while staying within a lower power budget. Obviously that can't compete with NVIDIA racks, but maybe it would be enough for mobile consumer inference?
Apple just announced it:
https://news.ycombinator.com/item?id=46248644