fc417fc802 • today at 1:20 PM • 1 reply

Efficient execution on the GPU appears to have been one of the authors' specific aims. Table 2 of their paper shows real-world performance that, at a glance, appears compatible with inference.

Replies

mskkm • today at 1:39 PM

This is not an LLM inference result. Table 2 is the part I find most questionable. Claiming orders-of-magnitude improvements in vector search over standard methods is an extraordinary claim. If it actually held up in practice, I would have expected to see independent reproductions or real-world adoption by now. It’s been about a year since the paper came out, and I haven’t seen much of either. That doesn’t prove the claim is false, but it certainly doesn’t inspire confidence.
