
villgax • today at 1:34 AM • 3 replies

That’s always been possible with the MPS backend. The reason people choose to omit it in HF spaces/demos is that HF doesn’t offer an MPS backend. People would rather have the thing work at the best speeds than at 10x worse speeds just for compatibility.

Replies

shivampkumar • today at 3:22 AM

IMO TRELLIS.2 is a slightly different case from the HF models scenario. It depends on five compiled CUDA-only extensions -- flex_gemm for sparse convolution, flash_attn, o_voxel for CUDA hashmap ops, cumesh for mesh processing, and nvdiffrast for differentiable rasterization. These aren't PyTorch ops that fall back to MPS -- they're custom C++/CUDA kernels. The upstream setup.sh literally exits with "No supported GPU found" if nvidia-smi isn't present. The only reason I picked this up is that I thought it was cool and no one was working on the open Apple Silicon issue back then (github.com/microsoft/TRELLIS.2/issues/74), which requests non-CUDA support.
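
To make the failure mode above concrete, here is a rough Python sketch of that kind of preflight check. It is not the upstream setup.sh: the function names are hypothetical, and it assumes each extension's import name matches the name used in the comment, which may not hold for every package.

```python
import importlib.util
import shutil

# Extension names as listed in the comment above.
# Assumption: the Python import names are identical to these.
CUDA_ONLY_EXTENSIONS = ["flex_gemm", "flash_attn", "o_voxel", "cumesh", "nvdiffrast"]


def has_nvidia_gpu(which=shutil.which):
    """Approximate the described check: is nvidia-smi on PATH?"""
    return which("nvidia-smi") is not None


def missing_extensions(names=CUDA_ONLY_EXTENSIONS, find_spec=importlib.util.find_spec):
    """Return the extensions that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


def preflight(which=shutil.which, find_spec=importlib.util.find_spec):
    """Hypothetical helper: report why a CUDA-only pipeline cannot start."""
    if not has_nvidia_gpu(which):
        # Same message the comment quotes from the upstream script.
        return "No supported GPU found"
    missing = missing_extensions(find_spec=find_spec)
    if missing:
        return "Missing extensions: " + ", ".join(missing)
    return "ok"


if __name__ == "__main__":
    print(preflight())
```

On a Mac without NVIDIA tooling the first check already fails, so an MPS device being available to PyTorch would not help: compiled kernels like these have no MPS fallback to dispatch to.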

Reubend • today at 2:09 AM

Are you saying the original one worked with MPS? Or are you just saying it was always theoretically possible to build what OP posted?

refulgentis • today at 1:54 AM

It’s always been possible, but it’s not possible because there’s no backend, and no one wants it to be possible because everyone needs it at 10x the speed of running on a Mac? I’m missing something, I think.
