
pawanjswal • last Wednesday at 5:19 AM • 0 replies • view on HN

I have seen many LLM devs encounter this at some point. Good to see that you are not only pointing out the inconsistency but also actively advocating for a common benchmark.
