storus • yesterday at 3:03 PM

This. It's awful to wait 15 minutes for an M3 Ultra to start generating tokens when your coding agent has 100k+ tokens in its context. This can be partially offset by adding a DGX Spark to accelerate the prefill phase. An M5 Ultra should be like a DGX Spark for prefill and an M3 Ultra for token generation, but who knows when it will show up and for how much? And it will still be at around 3080-class GPU levels, just with 512GB of RAM.
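For a rough sense of scale, the figures quoted here (about 15 minutes of waiting for 100k tokens of context) work out to a prefill rate of roughly 111 tokens per second. A back-of-the-envelope sketch, assuming prefill time scales linearly with context length (a simplification, since attention cost grows faster than linearly); the function names and the 4x speedup are hypothetical, for illustration only:

```python
def prefill_tokens_per_second(context_tokens: int, wait_minutes: float) -> float:
    """Implied prefill throughput from context size and time to first token."""
    return context_tokens / (wait_minutes * 60)


def prefill_wait_minutes(context_tokens: int, tokens_per_second: float) -> float:
    """Estimated wait before the first generated token, under the linear assumption."""
    return context_tokens / tokens_per_second / 60


# Figures from the comment: 100k tokens of context, 15 minutes of waiting.
rate = prefill_tokens_per_second(100_000, 15)
print(f"implied prefill rate: {rate:.0f} tok/s")

# Hypothetical: a device with 4x faster prefill (assumed figure, not a benchmark).
print(f"wait at 4x prefill: {prefill_wait_minutes(100_000, rate * 4):.2f} min")
```

Since the context is "100k+" tokens, the implied rate is a lower bound; the point is only that prefill, not token generation, dominates the wait at this context size.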
