I guess you're right in a purely geometric sense. It just seems almost silly to consider, given that (AIUI) the 3D geometric constraints don't impact the memory access latency at all for now (and likely won't for any reasonable period of time).
Like you said, thermal and cost constraints dwarf the geometric one. But I guess my point is that they make it a non-issue, so it isn't a sound theoretical explanation for why memory access is O(N^(1/3)).
The speed of light is roughly 30cm/ns. So accessing main memory 15cm away on a motherboard is about 0.5ns slower than cache, no matter whether the main memory is DRAM or SRAM. That's 2 clock cycles extra at 4GHz, which is a tiny fraction of the actual time (somewhere around 100ns) but not negligible. 0.5% or so is just enough that I'd say it can matter. Particularly since larger computers end up putting some of their RAM further away.
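To make that arithmetic easy to check, here is a minimal sketch using only the figures quoted above (30 cm/ns, 15 cm, 4 GHz, roughly 100 ns total access time). The function and variable names are made up for illustration. It assumes vacuum light speed and a one-way trip, so it is a lower bound rather than a measurement.

```python
# Back-of-the-envelope check of the figures in the comment above.
# Assumption: signals travel at the vacuum speed of light; real board
# traces are slower, so the true delay would be somewhat larger.
SPEED_OF_LIGHT_CM_PER_NS = 30.0


def propagation_delay_ns(distance_cm, speed_cm_per_ns=SPEED_OF_LIGHT_CM_PER_NS):
    """One-way propagation delay over distance_cm, in nanoseconds."""
    return distance_cm / speed_cm_per_ns


def delay_in_cycles(delay_ns, clock_ghz):
    """Convert a delay in nanoseconds to clock cycles at clock_ghz."""
    return delay_ns * clock_ghz


one_way_ns = propagation_delay_ns(15.0)        # 15 cm from CPU to DRAM
cycles = delay_in_cycles(one_way_ns, 4.0)      # at a 4 GHz clock
fraction_of_access = one_way_ns / 100.0        # against ~100 ns total latency

print(f"{one_way_ns} ns, {cycles} cycles, {fraction_of_access:.1%} of the access")
```

Note that a request and its reply each cross the distance, so counting the round trip would double these figures.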
> geometric constraints don't impact the memory access latency at all for now
I don't know; every time Intel/AMD increase cache size it also takes more cycles. That sounds like a speed of light limit.
Shouldn't thermal and cost constraints at scale also tend to relate to the volume of the individual components in the same way (ignoring constant factors) as the growth factors for an idealized memory structure around the CPU itself? In a more literal sense: the size and quantity of transistors (or other alternative units) determine the cost, heat dissipation, and volume of the memory simultaneously. Tweaking any of those parameters still ultimately results in a "how much can we handle in that volume of product" equation, which will be the ultimate bound.
The difference is that we spread them out into differently optimized volumes instead of building a homogeneous cube, which is (most likely IMO) where most of the constant factors come from.
I think this is the part the article glossed over to get straight to the empirical results, but I also don't feel it's an inherently unreasonable set of assumptions. At the very least, it matches what the theoretical limit would be in the far future, even if it only coincidentally matches current systems for other reasons.
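For what it's worth, the idealized geometric argument can be sketched in a few lines. The assumptions here are mine, not the article's: memory is packed at a constant density into a sphere around the CPU, and latency is bounded below by the one-way light travel time to the farthest bit. The density value and function names are hypothetical placeholders.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def min_latency_s(n_bits, bits_per_m3):
    """Lower bound on latency to the farthest bit of a sphere of memory.

    Assumes constant storage density (bits_per_m3 is a hypothetical
    parameter) and one-way travel at the vacuum speed of light.
    """
    volume_m3 = n_bits / bits_per_m3
    radius_m = (3.0 * volume_m3 / (4.0 * math.pi)) ** (1.0 / 3.0)
    return radius_m / SPEED_OF_LIGHT_M_PER_S


def scaling_exponent(n_small, n_large, bits_per_m3):
    """Empirical exponent k in latency ~ N^k between two memory sizes."""
    ratio = min_latency_s(n_large, bits_per_m3) / min_latency_s(n_small, bits_per_m3)
    return math.log(ratio) / math.log(n_large / n_small)


DENSITY = 1e18  # bits per cubic metre; an arbitrary illustrative value
print(scaling_exponent(1e9, 1e12, DENSITY))
```

Under these assumptions the exponent is 1/3 regardless of the density chosen, since density only changes the constant factor. That is the sense in which thermal and cost limits, as long as they also cap how much fits in a given volume, would change the constant but not the exponent.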