rahimnathwani • today at 5:22 PM • 3 replies

Has anyone successfully run this on a Mac? The installation instructions appear to assume an NVIDIA GPU (CUDA, FlashAttention), and I’m not sure whether it works with PyTorch’s Metal/MPS backend.

Replies

magicalhippo • today at 6:53 PM

FWIW, you can run the demo without FlashAttention by passing the --no-flash-attn command-line parameter. I do that since I'm on Windows and haven't gotten FlashAttention2 to work.
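
A rough sketch of how one might decide when to pass that flag, in case it helps. It only picks a PyTorch device and decides whether to add --no-flash-attn; it does not show that the model itself runs on MPS, which nobody in this thread has confirmed. The helper names (pick_device, demo_flags, probe) are made up for illustration, and the only flag taken from the demo is --no-flash-attn.

```python
import importlib.util


def pick_device(cuda_available, mps_available):
    """Prefer CUDA, then Apple's Metal (MPS) backend, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def demo_flags(device, flash_attn_installed):
    """Extra command-line flags for the demo (hypothetical helper).

    FlashAttention is CUDA-only, so it is skipped unless the device is
    CUDA and the flash_attn package is actually installed.
    """
    if device == "cuda" and flash_attn_installed:
        return []
    return ["--no-flash-attn"]


def probe():
    """Return (cuda_available, mps_available, flash_attn_installed)."""
    flash = importlib.util.find_spec("flash_attn") is not None
    try:
        import torch
    except ImportError:
        # Without PyTorch nothing is available; fall back to CPU.
        return False, False, flash
    mps = getattr(torch.backends, "mps", None)
    mps_ok = mps is not None and mps.is_available()
    return torch.cuda.is_available(), bool(mps_ok), flash


if __name__ == "__main__":
    cuda_ok, mps_ok, flash_ok = probe()
    device = pick_device(cuda_ok, mps_ok)
    print(device, demo_flags(device, flash_ok))
```

On a Mac this should select "mps" (or "cpu" if MPS is unavailable) and always add --no-flash-attn. Whether the rest of the inference code works there is a separate question.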

turnsout • today at 6:57 PM

It seems to depend on FlashAttention, so the short answer is no. Hopefully someone does the work of porting the inference code over!

javier123454321 • today at 5:32 PM

I recommend using Modal to rent the metal.
