It's orders of magnitude cheaper to serve requests with conventional methods than directly with an LLM. My back-of-envelope calculation says that, optimistically, it takes more than 100 GFLOPs to generate just 10 tokens with a 7-billion-parameter LLM. There are better ways to use electricity.
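For reference, here's a minimal sketch of that back-of-envelope number, assuming the common rule of thumb of roughly 2 FLOPs per parameter per generated token for a dense decoder-only model (the function name and constants below are purely illustrative):

    # Back-of-envelope decode cost for a dense decoder-only LLM.
    # Assumes ~2 FLOPs per parameter per generated token (matmul
    # multiply-adds only; ignores attention-over-context overhead).
    def decode_flops(n_params: float, n_tokens: int, flops_per_param: float = 2.0) -> float:
        """Rough total floating-point operations to generate n_tokens tokens."""
        return flops_per_param * n_params * n_tokens

    total = decode_flops(n_params=7e9, n_tokens=10)
    print(f"~{total / 1e9:.0f} GFLOPs")  # ~140 GFLOPs, i.e. "more than 100"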
Try convincing the investors. The way the industry is headed is not necessarily aligned with what is optimal. That might be the future whether we like it or not. Losing billions seems to be the trend.
Sure, but for certain problem domains we can start with an LLM to build V1 (or at least a demo) faster, then apply traditional coding techniques as an efficiency optimization once product-market fit is established.
I work in enterprise IT and sometimes wonder if we should factor in the equivalent energy cost of the human effort - both productive and unproductive - that underlies these "output/cost" comparisons.
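To sketch what that might look like (purely illustrative: the ~100 W human metabolic figure and the 8-hour workday below are my own assumptions, not numbers from any actual comparison):

    # Hypothetical conversion of human effort into energy terms.
    # ~100 W is a rough metabolic power for light desk work; 8 h is an
    # assumed workday. Both are illustrative assumptions.
    HUMAN_POWER_W = 100.0
    WORKDAY_HOURS = 8.0

    def human_effort_joules(person_hours: float) -> float:
        """Approximate energy (joules) dissipated over person_hours of desk work."""
        return HUMAN_POWER_W * person_hours * 3600.0

    per_day = human_effort_joules(WORKDAY_HOURS)
    print(f"One workday ~ {per_day / 1e6:.1f} MJ ~ {per_day / 3.6e6:.1f} kWh")
    # -> One workday ~ 2.9 MJ ~ 0.8 kWh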
I realize it sounds inhuman, but so is working in enterprise IT! :)