Banditoz • today at 4:25 PM • 1 reply

If the benchmarks are private, how do we reproduce the results? I looked up Humanity's Last Exam (https://agi.safe.ai/), which this model uses, and I can't seem to access it.

Replies

johndough • today at 5:09 PM

You can request access here: https://huggingface.co/datasets/cais/hle

The test data is purposely difficult to access to reduce the chance of leaking it into the training dataset.
