Agreed. Would be great if everyone starts reporting cost per task alongside eval scores, especially in a world where you can spend arbitrary test-time compute. This is one thing I like about the Artificial Analysis website - they include cost to run alongside their eval scores: https://artificialanalysis.ai/