logoalt Hacker News

petesergeant12/09/20241 replyview on HN

This isn’t a problem in practice. Most of my prompts ask the LLM to do a bunch of chain of thought before asking them to spit out JSON. I extract the JSON, which works 97.5% of the time, and have a retry step being real specific about “here’s the conversation so far but I need JSON now” that handles the rest. Adding examples really helps.


Replies

imtringued12/09/2024

https://lmsys.org/blog/2024-02-05-compressed-fsm/

I'm not trying to shill sglang specifically, just pointing out that there's a better way, btw.

show 1 reply