hypfer • today at 7:45 AM • 4 replies

This is an absolutely crazy, wasteful thing to do considering the actual cost of all that inference, and nothing to be proud of.

Replies

loehnsberg • today at 8:30 AM

Unless we do our own benchmarks, we have to take all the marketing fluff from the frontier labs at face value, and all public benchmarks degrade eventually as labs optimize towards them. OP's approach is wasteful because it is brute force, but the post says that an Elo rating is kept, so this is also an experiment, and I don't see what's wrong with that. You learn which model performs well in which settings, which may save resources later. It's also wasteful to keep working with the wrong model/harness/tools for too long.

mg • today at 8:17 AM

It is the other way round.

In an interactive session, adding "Fine, but make the button red" after the model has generated a first solution more than doubles the tokens used, because the model now gets not only the original code and the feature request but also the updated code plus the change request as input tokens.

Sending a feature request to an LLM and then sending the feature request again with "The button shall be red" only doubles the tokens used.
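
To make the arithmetic concrete, here is a rough sketch of the two token counts. The function names and all token sizes are made up for illustration, and it assumes input and output tokens are counted equally, the regenerated code is the same size as the first version, and no prompt caching is involved.

```python
def single_request(original, feature, generated):
    """Tokens for one request: existing code + feature request in, new code out."""
    return original + feature + generated


def interactive_followup(original, feature, generated, change):
    """First request, then a follow-up change request in the same session."""
    first = single_request(original, feature, generated)
    # The second turn re-sends the whole history (original code, feature
    # request, generated code) plus the change request, then the model
    # emits the code again.
    second = (original + feature + generated + change) + generated
    return first + second


def resend_with_change(original, feature, generated, change):
    """First request, then a fresh request with the change appended."""
    first = single_request(original, feature, generated)
    # The second request carries only the original code, the feature
    # request and the extra sentence; the first output is not re-sent.
    second = (original + feature + change) + generated
    return first + second


if __name__ == "__main__":
    # Hypothetical sizes in tokens, chosen only to show the ratio.
    sizes = dict(original=1000, feature=50, generated=1000)
    base = single_request(**sizes)
    print("interactive:", interactive_followup(**sizes, change=10) / base)
    print("resend:     ", resend_with_change(**sizes, change=10) / base)
```

With these made-up sizes the interactive follow-up costs about 2.5x a single request, while re-sending costs about 2.0x. The gap grows with the size of the generated code that has to be fed back in.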

redox99 • today at 8:18 AM

Probably like 1% of the energy an average person spends on driving.

cyanydeez • today at 8:05 AM

come on now, we can't just not escape the permanent underclass by using our brains, we've also got to use up all the resources while doing it.
