JonChesterfield • today at 6:01 PM

I see a lot of references to `device_map="cuda:0"` but no CUDA in the GitHub repo. Is the complete stack flash attention plus this Python plus the weights file, or does one need vLLM running as well?
