petesergeant • 12/09/2024

This isn’t a problem in practice. Most of my prompts ask the LLM to do a bunch of chain of thought before asking it to spit out JSON. I extract the JSON, which works 97.5% of the time, and a retry step that gets real specific (“here’s the conversation so far but I need JSON now”) handles the rest. Adding examples really helps.
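
A minimal sketch of that extract-then-retry loop, not the actual code behind the comment. `call_llm` is a hypothetical stand-in for whatever client you use (it takes the conversation text and returns the reply text), and the exact retry wording and `max_retries` default are assumptions. The extraction is deliberately naive: it returns the first span that parses as a JSON object, so a stray valid object inside the chain of thought would win.

```python
import json
from typing import Any, Callable, Optional

# Assumed retry wording, modelled on the phrasing quoted in the comment.
RETRY_NOTE = "Here's the conversation so far, but I need JSON now."


def extract_json(text: str) -> Optional[Any]:
    """Return the first valid JSON object embedded in text, or None."""
    decoder = json.JSONDecoder()
    for start, ch in enumerate(text):
        if ch != "{":
            continue
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever trails after it.
            value, _end = decoder.raw_decode(text, start)
            return value
        except json.JSONDecodeError:
            continue  # not JSON here; try the next opening brace
    return None


def ask_for_json(
    call_llm: Callable[[str], str],  # hypothetical: conversation in, reply out
    prompt: str,
    max_retries: int = 1,
) -> Any:
    """Ask once; if no JSON comes back, replay the conversation and insist."""
    conversation = prompt
    for _ in range(max_retries + 1):
        reply = call_llm(conversation)
        parsed = extract_json(reply)
        if parsed is not None:
            return parsed
        # Keep the failed reply in context and add the explicit JSON request.
        conversation = f"{conversation}\n{reply}\n\n{RETRY_NOTE}"
    raise ValueError("no JSON found after retries")
```

Usage would look something like `ask_for_json(my_client_fn, "Think step by step, then answer as JSON: ...")`, where `my_client_fn` wraps your own API call.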

Replies

imtringued • 12/09/2024

https://lmsys.org/blog/2024-02-05-compressed-fsm/

I'm not trying to shill sglang specifically, just pointing out that there's a better way, btw.

