> memory bandwidth is always the bottleneck
I'm hoping that today's complaints are tomorrow's innovations. Think back to when a 1 MB hard drive cost $100,000, or when Gates supposedly said 640 KB is enough.
Perhaps someone 'in the (chip) industry' can comment on what RAM manufacturers are doing at the moment - better, faster, larger? Or is there not much headroom left, and it's down to mobo manufacturers and volume?
For larger contexts, the bottleneck is probably token prefill rather than memory bandwidth, since prefill processes the whole prompt at once and tends to be compute-bound. Supposedly prefill is faster on the M5+ GPUs, but it's still a big hurdle on pre-M5 chips.
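To put rough numbers on prefill vs. memory bandwidth, here's a back-of-envelope sketch. Everything in it is an assumption for illustration: the function names are made up, the hardware figures are placeholders rather than specs of any real chip, and the "2 FLOPs per parameter per token" rule ignores attention's extra cost at long contexts.

```python
# Back-of-envelope roofline estimate. All numbers are hypothetical placeholders.

def decode_tokens_per_sec(model_bytes, mem_bandwidth_bytes_per_sec):
    """Upper bound on generation speed: each new token streams all weights once."""
    return mem_bandwidth_bytes_per_sec / model_bytes


def prefill_seconds(n_params, prompt_tokens, flops_per_sec):
    """Lower bound on prompt processing time, assuming ~2 FLOPs per parameter
    per token and perfect utilization (ignores attention's quadratic term)."""
    return 2 * n_params * prompt_tokens / flops_per_sec


# Assumed setup: 8B-parameter model quantized to 4 bits (0.5 bytes per parameter)
N_PARAMS = 8e9
MODEL_BYTES = N_PARAMS * 0.5

# Assumed hardware: 400 GB/s memory bandwidth, 20 TFLOP/s usable compute
MEM_BANDWIDTH = 400e9
COMPUTE = 20e12

# Assumed prompt length
PROMPT_TOKENS = 32_000

decode_rate = decode_tokens_per_sec(MODEL_BYTES, MEM_BANDWIDTH)
prefill_time = prefill_seconds(N_PARAMS, PROMPT_TOKENS, COMPUTE)

print(f"decode:  at most {decode_rate:.0f} tokens/s (bandwidth-bound)")
print(f"prefill: at least {prefill_time:.1f} s for {PROMPT_TOKENS} tokens (compute-bound)")
```

With these made-up numbers, decode tops out around 100 tokens/s, but a 32k-token prompt takes at least about 25 seconds before the first token appears. More memory bandwidth wouldn't change that wait; only more compute would.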
Chip speed has increased faster than memory speed for a long time now, leaving DRAM behind. GDDR was good for a while but is no longer sufficient. HBM is what's used now.
The last logical step of this process would be figuring out how to mix the CPU transistors with the RAM capacitors on the same chip, as opposed to merely stacking separate chips in the same package.
A related stopgap is the AI startup (I forget which) making accelerators on giant chips full of SRAM. It's not a cost-effective approach outside of ML.