> Also, cheaper... X99 + 8x DDR4 + 2696V4 + 4x Tesla P4s running on llama.cpp. Total cost about $500 including case and a 650W PSU, excluding RAM.
Excluding RAM in your pricing is misleading right now.
That’s a lot of work and money just to get 10 tokens/sec.