logoalt Hacker News

armcattoday at 5:33 PM0 repliesview on HN

I really like BAML but this post seems a little too much like a BAML funnel. Here are three methods that worked for me consistently since constrained sampling first came out:

1. Add a validation step (using a mini model) right at the beginning - sub-second response times; the validation will either emit True/False or emit a function call

2. Use a sequence of (1) large model without structured outputs for reasoning/parsing, chained to (2) small model for constrained sampling/structured output

3. Keep your Pydantic models/schemas as flat (not too nested and not too many enumarations) and "help" the model in the system prompt as much as you can