Hacker News

BoredPositron today at 3:52 PM · 1 reply

To be clear, we are not discussing small toy models, though to be fair I also don't use consumer cards. The benchmarks are out there (Phoronix, RunPod, Hugging Face, and Nvidia's own presentations), and they show at least a 2x uplift at high precision and nearly 4x at low precision, which matches what I see on my 6000-series cards. If you don't see the performance uplift everyone else sees, something is wrong with your setup, and I don't have the time to debug it.


Replies

qayxc today at 9:13 PM

> To be clear, we are not discussing small toy models but to be fair I also don't use consumer cards.

> if you don't see the performance uplift everyone else sees there is something wrong with your setup and I don't have the time to debug it.

Read those two statements together and consider what the issue might be. I only run what you call "toy models" (good enough for my purposes), so of course your experience is fundamentally different from mine. Spending five figures on hardware just to run models locally is usually a bad investment. Repurposing old hardware, OTOH, is a perfectly fine way to play with local models and optimise them for specific applications and workflows.