NUMA has a huge amount of overhead (e.g. in terms of intercore latency), and NUMA server CPUs cost a lot more than single socket boards. If you look at the servers at Google or Facebook they will have some NUMA servers for certain workloads that actually need them, but most most servers will be single socket because they're cheaper and applications literally run faster on them. It's a win win if you can fit your workload on a single socket server so there is a lot of motivation to make applications work in a non-NUMA way if at all possible.