smolder • 10/12/2024 • 12 replies

The weirdest one of the bunch is the AMD EPYC 9175F: 16 cores with 512MB of L3 cache! Presumably this is for customers trying to minimize software costs that are based on "per-core" licensing. It really doesn't make much sense to have so few cores at such an expense, otherwise. Does Oracle still use this style of licensing? If so, they need to knock it off.

The only other thing I can think of is some purpose like HFT may need to fit a whole algorithm in L3 for absolute minimum latency, and maybe they want only the best core in each chiplet? It's probably about software licenses, though.

Replies

bob1029 • 10/12/2024

Another good example is any kind of discrete event simulation. Things like spiking neural networks are inherently single-threaded if you are simulating them accurately (i.e., serialized through the pending spike queue). Being able to keep all the state in local cache and picking the fastest core to do the job is the best possible arrangement. The ability to run 16 in parallel simply reduces the search space by the same factor. Worrying about inter-CCD latency isn't a thing for these kinds of problems. The amount of bandwidth between cores is minimal, even if we were doing something like a genetic algorithm with periodic crossover between physical cores.
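
To make the "serialized through the pending spike queue" point concrete, here is a minimal sketch of an event-driven loop in Python. The neuron model, the `weights` layout, and the `threshold` and `delay` parameters are all assumptions made up for illustration, not anything from a real simulator. The point is only that each spike has to be popped in time order before the next one can be processed, which is why a single fast core with the state in cache helps.

```python
import heapq

def simulate(initial_spikes, weights, threshold=1.0, delay=1.0, t_max=10.0):
    """Toy event-driven simulation: spikes are handled strictly in time order.

    initial_spikes: list of (time, neuron) tuples that seed the queue.
    weights: dict mapping neuron -> {target_neuron: weight} (hypothetical layout).
    """
    potential = {n: 0.0 for n in weights}
    queue = list(initial_spikes)
    heapq.heapify(queue)  # the single pending-spike queue
    fired = []
    while queue:
        t, n = heapq.heappop(queue)  # always the earliest pending spike
        if t > t_max:
            break
        fired.append((t, n))
        for target, w in weights[n].items():
            potential[target] += w
            if potential[target] >= threshold:
                potential[target] = 0.0  # reset, then schedule the target's spike
                heapq.heappush(queue, (t + delay, target))
    return fired
```

Running 16 of these on 16 cores means 16 independent simulations (different seeds or parameters), not one simulation split 16 ways.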

londons_explore • 10/12/2024

Plenty of applications are single threaded and it's cheaper to spend thousands on a super fast CPU to run it as fast as possible than spend tens of thousands on a programmer to rewrite the code to be more parallel.

And like you say, plenty of times it is infeasible to rewrite the code because it's third-party code for which you don't have the source or the rights.

bee_rider • 10/12/2024

512 MB of cache, wow.

A couple years ago I noticed that some Xeons I was using had as much cache as the RAM in the systems I had growing up (millennial, so, we’re not talking about ancient Commodores or whatever; real usable computers that could play Quake and everything).

But 512MB? That’s roomy. Could Puppy Linux just be held entirely in L3 cache?

Jestzer • 10/12/2024

MATLAB Parallel Server also does per-core licensing.

https://www.mathworks.com/products/matlab-parallel-server/li....

Aurornis • 10/12/2024

Many algorithms are limited by memory bandwidth. On my 16-core workstation I’ve run several workloads that have peak performance with less than 16 threads.

It’s common practice to test algorithms with different numbers of threads and then use the optimal number of threads. For memory-intensive algorithms the peak performance frequently comes in at a relatively small number of cores.
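
A rough sketch of that kind of sweep in Python, assuming you already have some callable `run(n_threads)` that executes your workload with a given thread count. `run` and `sweep_thread_counts` are hypothetical names, not part of any library.

```python
import time

def sweep_thread_counts(run, counts, repeats=3):
    """Time run(n) for each n in counts and return (best_n, timings).

    run: caller-supplied function that executes the workload with n threads.
    timings maps each thread count to its fastest observed wall-clock time.
    """
    timings = {}
    for n in counts:
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            run(n)
            samples.append(time.perf_counter() - start)
        timings[n] = min(samples)  # best-of-repeats damps noise from other processes
    best = min(timings, key=timings.get)
    return best, timings
```

Something like `best, timings = sweep_thread_counts(run, [1, 2, 4, 8, 16])` then gives you the count to configure; for a bandwidth-bound workload the winner is often well below the core count.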

RHab • 10/12/2024

Abaqus, for example, is licensed per core. I am severely limited by that, so for me this makes total sense.

heraldgeezer • 10/12/2024

Windows Server and MSSQL are per core now. A lot of enterprise software is. They are switching to per-core licensing because before it was based on CPU sockets. Not just Oracle.

aecmadden • 10/14/2024

This optimises for a key VMware license mechanism: "Per core licensing with a minimum of 16 cores licensed per CPU."
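
As a quick illustration of why a 16-core part lines up with that rule, here is the arithmetic as a small Python sketch. The function name and signature are made up for this example; only the 16-core-per-CPU floor comes from the quoted wording.

```python
def licensed_cores(cores_per_cpu, cpus=1, minimum_per_cpu=16):
    """Cores you pay for under a per-core license with a per-CPU minimum."""
    # Each CPU is billed for its real core count or the minimum, whichever is larger.
    return max(cores_per_cpu, minimum_per_cpu) * cpus
```

Under this rule an 8-core CPU is billed as 16 cores anyway, so 16 cores is the largest count that costs nothing beyond the floor.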

puzzlingcaptcha • 10/12/2024

Windows server licensing starts at 16 cores

forinti • 10/12/2024

You can pin which cores you will use and so stay within your contract with Oracle.
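
For what pinning looks like at the OS level, here is a minimal Linux-only sketch using Python's `os.sched_setaffinity`. The helper name `pin_to_cores` is made up, and whether a given vendor accepts affinity pinning as limiting the licensed core count is a contract question that this code says nothing about.

```python
import os

def pin_to_cores(cores):
    """Restrict the current process to the given CPU indices (Linux only)."""
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("CPU affinity is not exposed on this platform")
    os.sched_setaffinity(0, set(cores))  # pid 0 means the calling process
    return os.sched_getaffinity(0)       # the set of CPUs now allowed
```

For example, `pin_to_cores(range(4))` keeps the process on CPUs 0 through 3, provided those CPUs exist and are available to it.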

elil17 • 10/14/2024

Many computational fluid dynamics programs have per core licensing and also benefit from large amounts of cache.

yusyusyus • 10/12/2024

New VMware licensing is per-core.
