bthornbury, yesterday at 9:30 PM

Something like a perplexity/log-likelihood measurement across a large enough number of prompts and tokens might get you the same thing in a statistical sense, though. I expect the comparison percentages at the top are something like that.
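
For what it's worth, here is a rough sketch of what such a measurement could look like. It assumes you already have per-token natural-log probabilities from two models on the same prompts; the function names and the toy numbers are made up for illustration and are not taken from whatever produced the percentages at the top.

```python
import math


def perplexity(token_logprobs):
    """Perplexity = exp(-mean natural-log probability per token)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def corpus_perplexity(per_prompt_logprobs):
    """Pool tokens across all prompts, so longer prompts carry more weight."""
    pooled = [lp for prompt in per_prompt_logprobs for lp in prompt]
    return perplexity(pooled)


def relative_change_pct(reference_ppl, candidate_ppl):
    """Percent change in perplexity of the candidate relative to the reference."""
    return 100.0 * (candidate_ppl - reference_ppl) / reference_ppl


# Hypothetical per-token log-probs for two prompts (illustrative values only).
reference = [[math.log(0.5), math.log(0.25)], [math.log(0.5)]]
candidate = [[math.log(0.5), math.log(0.125)], [math.log(0.5)]]

ref_ppl = corpus_perplexity(reference)
cand_ppl = corpus_perplexity(candidate)
print(f"reference: {ref_ppl:.3f}  candidate: {cand_ppl:.3f}  "
      f"change: {relative_change_pct(ref_ppl, cand_ppl):+.1f}%")
```

With only three tokens the number means nothing; the point of the comment stands that you would need a lot of prompts and tokens before a difference like this is statistically meaningful.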