Don't buy the Mini or the Studio. Both have the M4, which lacks the M5's Neural Accelerators, making prompt processing ~3-4x slower.
I assume those don't just work automatically with an off-the-shelf GGUF. What do you need in your local inference stack to take advantage of the M5's Neural Accelerators?
Apple Mac Studio (M3 Ultra chip, 28-core CPU, 60-core GPU, 96 GB RAM, 1 TB storage)
How is this config?