
guerython | today at 1:12 AM

Instant is the low-latency analysis stage for us. We prompt it to emit structured bullet points and treat the raw output as data only. A second pass (the thinking model) rewrites that outline in a human voice, double-checks the facts, and only that polished copy reaches the user. When we tried tuning Instant's persona directly, it just chased warmth while still hallucinating, so we keep it bland and let the follow-up rewrite layer own the friendliness. Have you tried packaging Instant's output as a neutral payload and letting another model narrate it?
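
A minimal sketch of the two-stage pattern described above, with both model calls stubbed out as placeholder functions (the real pipeline would call actual model APIs; the function names and payload shape here are my own illustration, not the commenter's code):

```python
import json


def analysis_pass(question: str) -> list[str]:
    # Stand-in for the low-latency model ("Instant" in the comment).
    # It is prompted to emit terse structured bullets; the output is
    # treated strictly as data and never shown to the user.
    return [
        f"claim: question was '{question}'",
        "claim: answer needs two stages",
    ]


def rewrite_pass(bullets: list[str]) -> str:
    # Stand-in for the slower "thinking" model. It receives the bullets
    # as a neutral JSON payload and owns tone, fact-checking, and the
    # final human-voiced wording.
    payload = json.dumps({"bullets": bullets})
    return f"Polished answer based on payload: {payload}"


def answer(question: str) -> str:
    bullets = analysis_pass(question)   # stage 1: data-only outline
    return rewrite_pass(bullets)        # stage 2: narrated rewrite


print(answer("why two stages?"))
```

The key design point is the boundary: stage 1's persona is irrelevant because its output is serialized as a payload, so all warmth lives entirely in stage 2.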