Hacker News

GoatInGrey, yesterday at 4:59 PM

Possible, though you eventually run into kinds of issues that you don't recall the model having before, like trouble accessing a database or not following the SOP you have it read each time it performs some routine task X. There are also patterns that are much less ambiguous, like getting caught in loops or failing to execute a script it wrote after ten attempts.


Replies

merlindru, yesterday at 9:28 PM

yes but i keep wondering if that's just the game of chance doing its thing

like these models are nondeterministic right? (besides the fact that rng things like top k selection and temperature exist)

say with every prompt there's a 2% chance the AI gets it massively wrong. what if i had just lucked out the past couple weeks and now i'm on a streak of bad luck?

and since my expectations are based on its previous (lucky) performance i now judge it even though it isn't different?

or is it giving you consistently worse performance, not able to get it right even after clearing context and trying again, on the exact same problem etc?
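
to put rough numbers on the 2% scenario, here's a quick sketch. the 2% is the made-up figure from above; the 100 prompts per stretch, the "5 or more bad results" threshold, and the assumption that prompts fail independently of each other are all mine, purely for illustration:

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(at least k failures in n independent prompts), each failing with probability p."""
    # Complement of the binomial probability of seeing fewer than k failures.
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))


p = 0.02  # hypothetical per-prompt failure rate from the comment
n = 100   # assumed number of prompts in one stretch of work (illustrative)

lucky = (1 - p) ** n              # no failures at all across n prompts
unlucky = prob_at_least(5, n, p)  # five or more failures across n prompts

print(f"no failures in {n} prompts: {lucky:.3f}")
print(f"5+ failures in {n} prompts: {unlucky:.3f}")
```

under those assumptions a spotless run of 100 prompts happens about 13% of the time and a run with 5+ failures about 5% of the time, with the exact same model both times. so neither streak is rare enough on its own to prove anything changed, which is why retrying the exact same problem with a cleared context is the more telling test.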