Hacker News

Matrix Multiplications on GPUs Run Faster When Given "Predictable" Data (2024)

103 points by tosh last Saturday at 12:11 PM | 31 comments

Comments

dan_sbl today at 3:14 PM

> For example, when the GPU is fully idle, nvidia-smi tells me that it’s only pulling 88W of power.

I haven't used a non-laptop GPU in some time, but that is a crazy amount of "idle" power consumption. Is this normal for cards like this?

ggambetta today at 5:04 PM

I'd have guessed that multiply-by-0 and multiply-by-1 could be special-cased to take much faster, simpler code paths, as you would when writing a MUL routine for a processor that doesn't have one (I <3 Z80).
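
As a rough illustration of the kind of software multiply routine this comment alludes to, here is a minimal shift-and-add sketch with early exits for 0 and 1. The function name `shift_add_mul` and the returned step counter are hypothetical, chosen only for this example. It shows the special-casing idea in software and says nothing about how GPU matrix units are actually built.

```python
def shift_add_mul(a: int, b: int) -> tuple[int, int]:
    """Multiply two non-negative integers without using `*`.

    Returns (product, loop_iterations) so the cost of each path is visible.
    """
    # Fast paths: trivial operands skip the loop entirely.
    if a == 0 or b == 0:
        return 0, 0
    if a == 1:
        return b, 0
    if b == 1:
        return a, 0

    product = 0
    steps = 0
    while b:
        if b & 1:          # lowest bit of b set: add the shifted a
            product += a
        a <<= 1            # next bit of b is worth twice as much
        b >>= 1
        steps += 1
    return product, steps


print(shift_add_mul(0, 200))   # fast path, zero loop iterations
print(shift_add_mul(13, 11))   # general path, one iteration per bit of b
```

The general path runs once per bit of `b`, so the saving from the fast paths grows with operand width.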

amelius today at 3:17 PM

Sounds like a side channel attack waiting to happen.

jayd16 today at 2:56 PM

I can't tell from the blog: is this actually verified, or is it a theory plus numbers that show plausibility?

I could certainly come up with alternative theories about memory compression and prefetching if we were talking about texture reads.

jetsamflotsam today at 4:20 PM

I feel like many of the comments missed the point or didn't read the article. What I believe the article is saying (and I've read this many times during my PhD, for various reasons) is that the input data distribution affects how many transistor state changes occur during multiplication. Since these switching events account for a large portion of energy loss and heat generation, the clocks won't be throttled as much for certain data patterns.

I believe there was a workshop paper at SC24 that ran more experiments on this, but I can't find it now.
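
The state-change argument above can be approximated with a toy proxy: count how many bits differ between consecutive operands in a stream. This is only a sketch under loud assumptions. The helper `bit_toggles`, the 16-bit width, and the sample data are all made up for illustration, and real switching activity depends on the hardware datapath, which this does not model.

```python
import random


def bit_toggles(values: list[int]) -> int:
    """Total number of bits that differ between consecutive values."""
    return sum(bin(a ^ b).count("1") for a, b in zip(values, values[1:]))


rng = random.Random(0)   # fixed seed so runs are repeatable
n = 10_000

# "Unpredictable" data: independent random 16-bit patterns.
random_data = [rng.getrandbits(16) for _ in range(n)]

# "Predictable" data: one pattern repeated (0x3C00 is 1.0 in fp16).
constant_data = [0x3C00] * n

print("random  :", bit_toggles(random_data))
print("constant:", bit_toggles(constant_data))
```

A repeated value never flips a bit between consecutive operands, while independent random 16-bit values differ in about half their bits on average. Fewer flips is the intuition behind lower dynamic power for "predictable" inputs.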

nzach today at 2:27 PM

I went in expecting to find 'branch prediction'[0] as the answer, but apparently things are even more complex nowadays.

[0] - https://stackoverflow.com/questions/11227809/why-is-conditio...

gdevenyi today at 12:53 PM

People have been noticing the effects of this in local LLM inference. Power limiting seems to improve overall performance!

bitwize today at 3:20 PM

It wouldn't surprise me to see some ML algorithm in silico somewhere to select faster matmul paths on favorable data. Yo dawg, I heard you like AI, so we put some AI in your AI so you can infer while you're inferring.
