Hacker News

fulafel, today at 5:24 AM

Interesting that OpenBLAS and MPS are reportedly nearly the same speed, even though the README makes it sound like only MPS uses the GPU.


Replies

antirez, today at 9:05 AM

I think this is because the current code does a terrible job of keeping the activations on the GPU and fusing the kernels. That is indeed the next thing to fix in this implementation.
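
To make the point concrete, here is a toy cost model in plain Python. It is not the project's code and runs nothing on a GPU. The names `Counters`, `run_unfused`, and `run_fused` are made up for illustration, and the assumption that an unfused op costs one upload, one kernel launch, and one download is a simplification. It only shows why the number of launches and host/device copies can dominate when activations are not kept on the device.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    launches: int = 0   # simulated kernel launches
    transfers: int = 0  # simulated host<->device copies


def run_unfused(x, ops, c):
    """One kernel per op; activations return to the host after each op."""
    for op in ops:
        c.transfers += 1            # upload activations (assumed)
        c.launches += 1             # one kernel for this op
        x = [op(v) for v in x]
        c.transfers += 1            # download the result (assumed)
    return x


def run_fused(x, ops, c):
    """All ops composed into one kernel; activations stay on the device."""
    c.transfers += 1                # single upload
    c.launches += 1                 # single fused kernel

    def fused(v):
        for op in ops:
            v = op(v)
        return v

    x = [fused(v) for v in x]
    c.transfers += 1                # single download
    return x


# Hypothetical three-step elementwise pipeline: scale, bias, ReLU.
ops = [lambda v: v * 2.0, lambda v: v + 1.0, lambda v: max(v, 0.0)]
data = [-1.0, 0.5, 3.0]

unfused, fused = Counters(), Counters()
out_a = run_unfused(data, ops, unfused)
out_b = run_fused(data, ops, fused)

print(out_a == out_b)                       # same numerical result
print(unfused.launches, unfused.transfers)  # per-op overhead
print(fused.launches, fused.transfers)      # overhead paid once
```

Both paths compute the same values, but in this model the unfused path pays the launch and copy overhead once per op, while the fused path pays it once in total. If that overhead is large relative to the arithmetic, a GPU backend could plausibly end up no faster than a well-tuned CPU BLAS.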