carterschonwald today at 3:00 PM

this is literally just “leave a child at the work computer with a real doc open, playing office”. otoh it is good to design benchmarks to ground these things.

on the flip side if you’re literally just using a bare bones harness on top of a stochastic parrot, of course stochastic errors accumulate.

there's a lot of ways to improve text faithfulness through harness tool designs, and my incremental experiments seem promising.

but unless work is gated on shit like “the script used must be type-checked GHC Haskell or Lean 4”, unsupervised stuff is gonna decay


Replies

rhubarbtree today at 4:54 PM

It’s not a stochastic parrot.