This doesn't work for the tasks people are worried about because they want to lean on the gener...

BoorishBears • today at 6:57 AM • 0 replies • view on HN

This doesn't work for the tasks people are worried about because they want to lean on the generalization of the model + tool calling.

What you're describing is also already mostly achieved by using constrained decoding: if the injection would work under constrained decoding, it'll usually still work even if you SFT heavily on a single task + output format

alt Hacker News