
Retric, today at 8:27 AM

> seem to be fine

Now repeat the question to the same model in different contexts several times and count what percentage of the time it’s correct.
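
For anyone who wants to actually run that check, here is a rough sketch of the counting loop. Everything in it is an assumption for illustration: `ask` is a hypothetical stand-in for whatever function sends a prompt to your model and returns its reply, and exact string matching is only a crude way to score correctness.

```python
from typing import Callable, Sequence


def percent_correct(
    ask: Callable[[str], str],   # hypothetical: sends a prompt, returns the model's reply
    question: str,
    contexts: Sequence[str],     # different preambles to place before the question
    expected: str,
    repeats: int = 1,            # how many times to ask within each context
) -> float:
    """Return the percentage of trials in which the reply matches `expected`."""
    if not contexts or repeats < 1:
        raise ValueError("need at least one context and one repeat")
    total = 0
    correct = 0
    for context in contexts:
        for _ in range(repeats):
            # An empty context means the question is asked on its own.
            prompt = f"{context}\n\n{question}" if context else question
            answer = ask(prompt)
            total += 1
            # Crude scoring: case-insensitive exact match after trimming whitespace.
            if answer.strip().lower() == expected.strip().lower():
                correct += 1
    return 100.0 * correct / total
```

A single "seems fine" spot check corresponds to one context and one repeat; the number only means something once both are large enough to expose the failures.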