See https://github.com/geerlingguy/beowulf-ai-cluster/issues/17 for more data. I didn't save all the prompt processing times (Exo just outputs a time in ms, with no other data for that), but I will try to do another pass. Maybe I can also convince the Exo team to add a proper benchmarking capability à la `llama-bench` :)
Or better, like you mentioned, try to convince Exo to develop in the open, so everyone gets new capabilities as PRs.