Hacker News

CuriouslyC, today at 12:57 AM (1 reply)

Workflow failures signal violated assumptions, and those should ultimately percolate up to humans. Static DAGs are also more amenable to human understanding than dynamic task decomposition. Robustness in production is good, though, as long as you can bound agent behavior.
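
To make the point about static DAGs and escalation concrete, here is a rough sketch, not anyone's actual system. It assumes a hypothetical `run_dag` helper and a hypothetical `NeedsHuman` exception: steps run in a fixed topological order with a bounded number of attempts, and a step that keeps failing is surfaced to a person instead of being silently worked around.

```python
from graphlib import TopologicalSorter
from typing import Callable


class NeedsHuman(Exception):
    """Hypothetical signal: a step kept failing, so a person should look at it."""


def run_dag(
    deps: dict[str, set[str]],
    steps: dict[str, Callable[[], None]],
    max_attempts: int = 2,
) -> list[str]:
    """Run steps in a fixed topological order, retrying each a bounded number of times."""
    done: list[str] = []
    # deps maps each step name to the set of steps that must finish before it.
    for name in TopologicalSorter(deps).static_order():
        for attempt in range(1, max_attempts + 1):
            try:
                steps[name]()
                done.append(name)
                break
            except Exception as exc:
                # Retries are bounded; the final failure is escalated, not swallowed.
                if attempt == max_attempts:
                    raise NeedsHuman(f"step {name!r} failed {attempt} times") from exc
    return done
```

Because the graph is fixed up front, a reader can see every possible path before anything runs, which is the readability argument in a nutshell.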

Best-of-3 (or more) tournaments are a good strategy. You can also use them for RL via GRPO if you're running an open-weight model.
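
A minimal sketch of what that could look like, under stated assumptions: `best_of_n`, `longer_wins`, and `group_relative_advantages` are hypothetical names, and the toy judge stands in for whatever comparison you would really use (an LLM judge, a test suite). The tournament is a simple round robin, and the win counts are then normalized within the group, which is the group-relative advantage idea behind GRPO. The actual policy update is not shown.

```python
import statistics
from typing import Callable, Sequence


def best_of_n(
    candidates: Sequence[str],
    judge: Callable[[str, str], int],
) -> tuple[str, list[int]]:
    """Round-robin tournament: compare every pair once, return the winner and win counts."""
    wins = [0] * len(candidates)
    for i in range(len(candidates)):
        for j in range(i + 1, len(candidates)):
            # judge returns 0 if its first argument wins, 1 if its second wins.
            winner = i if judge(candidates[i], candidates[j]) == 0 else j
            wins[winner] += 1
    best = max(range(len(candidates)), key=wins.__getitem__)
    return candidates[best], wins


def longer_wins(a: str, b: str) -> int:
    """Toy stand-in judge (an assumption for illustration): the longer answer wins."""
    return 0 if len(a) >= len(b) else 1


def group_relative_advantages(rewards: Sequence[float], eps: float = 1e-8) -> list[float]:
    """Normalize rewards within one group: (r - mean) / (std + eps)."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]
```

Usage would be something like `winner, wins = best_of_n(samples, longer_wins)` followed by `group_relative_advantages(wins)`, with the samples all drawn from the same prompt so the group baseline is meaningful.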


Replies

ipnon, today at 1:22 AM

In HNese this means "very impressive, keep up the good work."