Hacker News

yorwba · yesterday at 8:18 PM

GPUs are more efficient than CPUs for LLM inference, using less energy per token and being cheaper overall. Yes, a single data center GPU draws a lot of power and costs a fortune, but by batching many requests it can also serve far more people in the time your CPU or consumer GPU needs to respond to a single prompt.
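The "less energy per token" claim comes down to power draw divided by throughput. A minimal back-of-envelope sketch, where every number is a hypothetical placeholder (not a measurement of any real chip):

```python
# Back-of-envelope energy-per-token comparison.
# All numbers below are hypothetical placeholders, not measurements.

def joules_per_token(watts: float, tokens_per_second: float) -> float:
    """Energy per generated token: power draw divided by throughput."""
    return watts / tokens_per_second

# Hypothetical data-center GPU: high power draw, but batching many
# concurrent requests yields very high aggregate throughput.
gpu = joules_per_token(watts=700.0, tokens_per_second=5000.0)

# Hypothetical CPU serving a single prompt at a time.
cpu = joules_per_token(watts=150.0, tokens_per_second=10.0)

print(f"GPU: {gpu:.3f} J/token")  # 0.140 J/token
print(f"CPU: {cpu:.3f} J/token")  # 15.000 J/token
```

With these made-up figures the GPU draws almost 5x the power but delivers 500x the tokens, so its energy per token is two orders of magnitude lower; the actual ratio depends entirely on the hardware and batch sizes involved.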


Replies

tolerance · yesterday at 8:22 PM

I got you, thanks!