
nh43215rgb · last Sunday at 10:38 AM

In practical generative AI workflows (LLMs), I think AMD Max+ 395 chips with unified memory are about as good as Mac Studio or MacBook Pro configurations at handling big models locally, and they support fast inference speeds. (However, top-end Apple silicon (M4 Max, Studio Ultra) can reach 546GB/s memory bandwidth, while the AMD unified memory system is around 256GB/s.) For inference I think either will work fine. For everything else I think the CUDA ecosystem is a better bet (correct me if I'm wrong).
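To see why the bandwidth numbers matter: single-stream token generation on a dense LLM is usually memory-bandwidth-bound, since every generated token requires streaming the full set of weights from memory once. A rough upper bound on decode speed is therefore bandwidth divided by model size in bytes. Here's a back-of-envelope sketch (the helper function and the 70B/4-bit example are my own illustrative assumptions, not from any vendor spec):

```python
# Rough rule of thumb: memory-bandwidth-bound decode speed for a dense LLM.
# Each generated token streams all weights from memory once, so
# tokens/sec is approximately bounded by bandwidth / model size in bytes.

def max_tokens_per_sec(bandwidth_gb_s: float, params_billions: float,
                       bytes_per_param: float) -> float:
    """Upper-bound estimate of single-stream decode speed (tokens/sec)."""
    model_size_gb = params_billions * bytes_per_param
    return bandwidth_gb_s / model_size_gb

# Hypothetical example: a 70B-parameter model at 4-bit quantization
# (~0.5 bytes/param, i.e. ~35 GB of weights).
for name, bw in [("Apple silicon at 546 GB/s", 546.0),
                 ("AMD unified memory at 256 GB/s", 256.0)]:
    est = max_tokens_per_sec(bw, params_billions=70, bytes_per_param=0.5)
    print(f"{name}: ~{est:.1f} tok/s upper bound")
```

This ignores compute limits, KV-cache traffic, and prompt processing (which is compute-bound), so real-world numbers land below these ceilings, but it explains why both platforms feel fine for local inference while bandwidth still separates them on larger models.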