logoalt Hacker News

h14hyesterday at 6:56 PM2 repliesview on HN

Would be really interesting to see an "Eager McBeaver" bench around this concept. When doing real work, a model's ability to stay within the bounds of a given task has almost become more important than its raw capabilities now that every frontier model is so dang good.

Every one of these models is so great at propelling the ship forward, that I increasingly care more and more about which models are the easiest to steer in the direction I actually want to go.


Replies

cglanyesterday at 7:06 PM

being TOO steerable is another issue though.

Codex is very steerable to a fault, and will gladly "monkey paw" your requests to a fault.

Claude Opus will ignore your instructions and do what it thinks is "right" and just barrel forward.

Both are bad and papering over the actual issue which is these models don't really have the ability to actually selectively choose their behavior per issue (ie ask for followup where needed, ignore users where needed, follow instructions where needed). Behavior is largely global

show 1 reply
h14hyesterday at 9:57 PM

For sure. I imagine it'd be pretty difficult to evaluate the "correct" amount of steer-ability. You'd probably just have to measure a delta in eagerness on a single same task between when given highly-specified prompts, and more open-ended prompts. Probably not dissimilar from how artificialanalysis.ai does their "omniscience index".