
shihab yesterday at 11:30 PM

> For example, NEON ... can hold up to 32 128-bit vectors to perform your operations without having to touch the "slow" memory.

Something I recently learnt: the actual number of physical registers in modern x86 CPUs is significantly larger, even for 512-bit SIMD. Zen 5 CPUs actually have 384 vector registers, 384*512b = 24KB!
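
To sanity-check that arithmetic, here is a minimal sketch that just redoes the multiplication. The register count and width are the figures quoted in this comment, not something measured, and the variable names are made up for illustration.

```python
# Figures taken from the comment above; nothing here is measured.
VECTOR_REGISTERS = 384   # physical vector registers claimed for Zen 5
REGISTER_BITS = 512      # width of one register in bits

# Total capacity of the physical vector register file.
total_bytes = VECTOR_REGISTERS * REGISTER_BITS // 8
total_kib = total_bytes / 1024

# Same calculation for the 32 named 128-bit NEON registers from the quote.
neon_bytes = 32 * 128 // 8

print(total_bytes, total_kib, neon_bytes)
```

So the quoted 24KB is 24 KiB (24576 bytes), versus 512 bytes of named NEON register state.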


Replies

cmovq today at 12:05 AM

This is true, but if you run out of the 32 register names you’ll still need to spill to memory. The large register file exists, among other things, to let multiple instructions execute in parallel.
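
One way to picture the split between register names and the physical register file is a toy renaming table. This is only a sketch under simplifying assumptions: the `Renamer` class, its method names, and the tiny sizes are hypothetical, physical registers are never freed, and it does not model any real CPU.

```python
from collections import deque


class Renamer:
    """Toy register renamer (hypothetical, for illustration only).

    Maps a small set of architectural register names onto a larger
    pool of physical registers.
    """

    def __init__(self, num_arch, num_phys):
        assert num_phys >= num_arch
        # Initially, architectural register i lives in physical register i.
        self.mapping = {i: i for i in range(num_arch)}
        # Remaining physical registers are free. This toy never returns
        # registers to the pool; real hardware recycles them.
        self.free = deque(range(num_arch, num_phys))

    def rename(self, dst, srcs):
        """Rename one instruction `dst = op(srcs)`.

        Returns (physical destination, list of physical sources).
        """
        # Sources read the current mapping before the destination is updated.
        phys_srcs = [self.mapping[s] for s in srcs]
        # Every write gets a fresh physical register, so reusing a name
        # does not create a false dependency on the previous value.
        phys_dst = self.free.popleft()
        self.mapping[dst] = phys_dst
        return phys_dst, phys_srcs


renamer = Renamer(num_arch=4, num_phys=12)
first_write = renamer.rename(dst=0, srcs=[])    # r0 = load a
first_use = renamer.rename(dst=1, srcs=[0])     # r1 = f(r0)
second_write = renamer.rename(dst=0, srcs=[])   # r0 = load b (name reused)
second_use = renamer.rename(dst=2, srcs=[0])    # r2 = f(r0)
```

Here both writes to `r0` land in different physical registers, so the two load/use pairs are independent of each other. The program still only has four names to work with, which is why running out of names means spilling even when the physical file has room.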

dapperdrake yesterday at 11:59 PM

In the register file or named registers?

And the critical size for matrix tiling is often set by SRAM capacity, so the unified L3 cache.