Hacker News

zamadatix 10/12/2024 (3 replies)

CCDs can't access each other's L3 cache as their own (the fabric penalty is too high to do that directly). Assuming it's anything like the 9174F, that means it's really 8 groups of 2 cores that each have 64 MB of L3 cache. Still enormous, and you can still access data over the Infinity Fabric with penalties, but it's not quite the single 16-core block with 512 MB of cache that it might sound like at first.

Zen 4 also had 96 MB per CCD variants like the 9184X, so 768 MB per socket, and they are dual socket, so you can end up with 1.5 GB of total L3 cache in a single machine! The downside is that, beyond CCD<->CCD latencies, you now have socket<->socket latencies.
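For anyone who wants to check the arithmetic, here is a small sketch of the totals quoted above. The CCD counts and per-CCD sizes are taken from the comment itself; the 8-CCD figure for the 9184X is an assumption inferred from 768 MB / 96 MB, and `l3_totals` is a hypothetical helper name, not anything from AMD.

```python
# L3 totals for the layouts mentioned in the comment above.
# All figures come from the comment; nothing here is measured.
MB_PER_GB = 1024


def l3_totals(ccds, mb_per_ccd, sockets=1):
    """Return (MB per socket, MB per machine) for a given CCD layout."""
    per_socket = ccds * mb_per_ccd
    return per_socket, per_socket * sockets


# 9174F-like layout: 8 CCDs with 64 MB each, single socket.
print(l3_totals(8, 64))  # (512, 512)

# 9184X: 96 MB per CCD, assumed 8 CCDs, two sockets.
per_socket, machine = l3_totals(8, 96, sockets=2)
print(per_socket, machine, machine / MB_PER_GB)  # 768 1536 1.5
```

Note that these are sums of separate per-CCD caches, not one pool a single core can use at full speed.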


Replies

edward28 10/12/2024

It's actually 16 CCDs with a single core and 32MB each.

nullc 10/13/2024

The 9684X has 1152 MB of cache per socket: 12 CCDs * 96 MB. A similar X-series Zen 5 is planned.

Though I wish they did some chips with 128 GB of high-bandwidth DRAM instead of extra-sized SRAM caches.

bee_rider 10/13/2024

Hmm. OK, instead of treating the cache as RAM, we will have to treat each CCD as a node and treat the chip as a cluster. It will be hard, but you can fit quite a bit in 64 MB.
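A rough sketch of what "each CCD as a node" could look like at the planning level: split a working set into L3-sized shards and hand them out to CCDs round-robin. The 8 CCDs and 64 MB per CCD are assumptions taken from the parent comment, and `plan_shards` is a hypothetical helper, not a real scheduler API. It only produces the assignment; it does not pin anything to cores.

```python
# Assumed layout from the parent comment: 8 CCDs, 64 MB of L3 each.
CCD_COUNT = 8
L3_BYTES_PER_CCD = 64 * 1024 * 1024


def plan_shards(total_bytes, ccd_count=CCD_COUNT, l3_bytes=L3_BYTES_PER_CCD):
    """Split a working set into shards no larger than one CCD's L3.

    Shards are assigned to CCDs round-robin. Returns a list of
    (ccd_index, start_offset, length) tuples.
    """
    shards = []
    offset = 0
    index = 0
    while offset < total_bytes:
        length = min(l3_bytes, total_bytes - offset)
        shards.append((index % ccd_count, offset, length))
        offset += length
        index += 1
    return shards


# A 512 MB working set maps to exactly one 64 MB shard per CCD.
plan = plan_shards(512 * 1024 * 1024)
for ccd, start, length in plan:
    print(ccd, start, length)
```

In practice a shard would need to be smaller than the full L3 to leave room for code and other data, and on Linux each worker would still have to be pinned to the cores of its CCD (for example with `os.sched_setaffinity`), with the core-to-CCD mapping read from the machine rather than assumed.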