the_fall • today at 4:34 AM • 2 replies

It might be an interesting LLM benchmark: how many can they list without breaking the rules (repetition or non-animals). Although I bet that big bucks would then be thrown at pointlessly optimizing for that benchmark, so...
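
For what it's worth, the scoring rule described here is easy to sketch: count items until the first repetition or non-animal. This is only a rough illustration, not an existing benchmark. The names `score_listing`, `is_animal`, and `KNOWN_ANIMALS` are made up for the example, and a real scorer would need a far better animal check than a four-word set.

```python
def score_listing(items, is_animal):
    """Return (valid_count, reason) where reason names the first rule break.

    `is_animal` is a caller-supplied predicate (hypothetical; a real
    benchmark would need a proper taxonomy lookup or a judge model).
    """
    seen = set()
    for count, raw in enumerate(items):
        name = raw.strip().lower()  # treat "Cat" and "cat" as the same entry
        if name in seen:
            return count, "repetition"
        if not is_animal(name):
            return count, "non-animal"
        seen.add(name)
    return len(items), None  # no rule was broken


# Toy stand-in for a real animal check (assumption, for illustration only).
KNOWN_ANIMALS = {"cat", "dog", "otter", "heron"}

result = score_listing(["Cat", "dog", "otter", "cat", "heron"],
                       KNOWN_ANIMALS.__contains__)
print(result)  # (3, 'repetition')
```

Normalizing only case and whitespace means "sea otter" and "otter" count as different entries; whether that should be a repeat is a judgment call the benchmark would have to make.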

Replies

bronco21016 • today at 4:43 AM

Might be an interesting problem for understanding how well various models recall prior tokens within the context window. I'm sure they could list animals until their window is full, but what I'm not sure of is how much of the window they could fill without repeating.
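
One way to put a number on "how much of the window they could fill without repeating" is to add up the size of the output before the first repeat and divide by the window size. The sketch below assumes a hypothetical helper name, `window_fraction_before_repeat`, and uses a whitespace word count as a crude stand-in for a real tokenizer, so the figures are only indicative.

```python
def window_fraction_before_repeat(items, window_tokens,
                                  count_tokens=lambda s: len(s.split())):
    """Fraction of the context window consumed before the first repeated item.

    `count_tokens` defaults to a whitespace word count, which is an
    assumption: a real measurement would use the model's own tokenizer.
    """
    seen = set()
    used = 0
    for raw in items:
        name = raw.strip().lower()
        if name in seen:
            break  # first repetition: stop counting
        seen.add(name)
        used += count_tokens(raw)
    return min(used / window_tokens, 1.0)


# Three unique entries (4 "tokens") before "dog" repeats, in an 8-token window.
print(window_fraction_before_repeat(["cat", "dog", "sea otter", "dog"], 8))  # 0.5
```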

OxfordOutlander • today at 5:16 AM

you might like https://github.com/aidanmclaughlin/AidanBench
