Why would it stop early? Any examples?

rtgfhyuj • last Thursday at 9:29 PM • 2 replies

Replies

Models just naturally arrive at the conclusion that they are done. TODO hints can help, but they are not infallible: Claude will stop and happily report that there's more work to be done and "you just say the word Mister and I'll continue". This is an RL problem where you have to balance the chance of an infinite loop (the model keeps thinking there's a little bit more to do when there is not) against the opposite failure, where it stops short of actual completion.


embedding-shape • last Thursday at 9:35 PM

Not all models are trained to follow long one-shot tasks by themselves; many of them seem to prefer closer interaction with the user. You could always add another layer of abstraction above or below the model to work around it.
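A rough sketch of what such an outer layer could look like, not tied to any real agent framework. Everything here is hypothetical: `run_until_done`, `step`, `is_done`, and the fake model are illustration-only names. The idea is to check completion outside the model and cap the number of rounds, so that neither an early stop nor an endless loop goes unnoticed.

```python
from typing import Callable, List


def run_until_done(
    step: Callable[[str], str],
    is_done: Callable[[List[str]], bool],
    task: str,
    max_rounds: int = 5,
) -> List[str]:
    """Re-prompt until an external check passes or the round budget runs out."""
    transcript: List[str] = []
    prompt = task
    for _ in range(max_rounds):
        transcript.append(step(prompt))
        # Completion is decided by the wrapper, not by the model's own claim.
        if is_done(transcript):
            break
        prompt = "Continue the task; it is not finished yet."
    return transcript


def make_fake_model(todo: List[str]) -> Callable[[str], str]:
    """Stand-in for a real model call: completes one TODO item per round."""
    remaining = list(todo)

    def step(prompt: str) -> str:
        if remaining:
            item = remaining.pop(0)
            return f"did {item}; {len(remaining)} left"
        return "nothing left"

    return step


def no_items_left(transcript: List[str]) -> bool:
    """Hypothetical completion check based on the last reply."""
    return transcript[-1].endswith("0 left")


result = run_until_done(make_fake_model(["a", "b", "c"]), no_items_left, "do a, b, c")
```

In a real setup `step` would wrap an actual model call, and `is_done` would be something checkable, such as a passing test suite or an empty TODO file. The `max_rounds` cap is the guard against the infinite-loop side of the trade-off.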

