logoalt Hacker News

Aurornisyesterday at 10:16 PM0 repliesview on HN

Power consumption numbers aren't useful for efficiency calculations without also considering the tokens per second for the same model and quantization.

I could write an engine that only uses 10W on your machine, but it wouldn't be meaningful if it was also 10X slower.

More power consumption is usually an indicator that the hardware is being fully utilized, all things equal (comparing GPU to GPU or CPU to CPU, not apples to oranges)