Hacker News

ychen306, last Saturday at 8:29 PM

It's orders of magnitude cheaper to serve requests with conventional methods than directly with an LLM. My back-of-envelope calculation says that, optimistically, it takes more than 100 GFLOPs to generate 10 tokens using a 7-billion-parameter LLM. There are better ways to use electricity.
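
A rough sketch of how a figure like that can be reproduced. It assumes the common rule of thumb of about 2 FLOPs per parameter per generated token for a dense decoder-only model, and it ignores attention over the context and prompt processing. The function name and the constant are illustrative assumptions, not necessarily the calculation the comment used.

```python
def decode_flops(n_params: float, n_tokens: int) -> float:
    """Approximate FLOPs needed to generate n_tokens with a dense model.

    Assumption: ~2 FLOPs (one multiply, one add) per parameter per token.
    Attention over the context and prompt processing are ignored.
    """
    flops_per_param_per_token = 2  # rule-of-thumb assumption
    return flops_per_param_per_token * n_params * n_tokens


total = decode_flops(7e9, 10)  # 7B parameters, 10 generated tokens
print(f"{total / 1e9:.0f} GFLOPs")  # prints "140 GFLOPs"
```

Under these assumptions the estimate is about 140 GFLOPs, which is consistent with "more than 100 GFLOPs" for 10 tokens.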


Replies

sramam, last Saturday at 8:56 PM

I work in enterprise IT and sometimes wonder whether we should add equivalent energy calculations for the human effort, both productive and unproductive, that underlies these "output/cost" comparisons.

I realize it sounds inhuman, but so is working in enterprise IT! :)

ls-a, last Saturday at 8:48 PM

Try convincing the investors. Where the industry is headed is not necessarily related to what is optimal. That might be the future whether we like it or not. Losing billions seems to be the trend.

nradov, last Saturday at 10:56 PM

Sure, but we can start with an LLM to build V1 (or at least a demo) faster for certain problem domains, then apply traditional coding techniques as an efficiency optimization after establishing product-market fit.