If anyone is curious, the most common tool I've seen for Elo estimation among engine developers is cutechess [1], which supports SPRT [2]. There's also Ordo [3], though I haven't used it myself.
[1] https://cutechess.com/
[2] https://www.chessprogramming.org/Sequential_Probability_Rati...
[3] https://github.com/michiguel/Ordo
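
For a rough idea of what an SPRT stopping rule looks like, here is a minimal Python sketch. It uses the common normal-approximation (GSPRT-style) log-likelihood ratio computed from win/draw/loss counts; this is an illustration under that assumption, not cutechess's actual implementation. The function names (`elo_to_score`, `sprt_llr`, `sprt_decision`) and the defaults `alpha = beta = 0.05` are my own choices for the example.

```python
import math


def elo_to_score(elo: float) -> float:
    """Expected score for a given Elo difference (logistic model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))


def sprt_llr(wins: int, draws: int, losses: int, elo0: float, elo1: float) -> float:
    """Approximate log-likelihood ratio of H1 (elo1) versus H0 (elo0).

    Normal approximation based on the observed mean score and its variance.
    Returns 0.0 when there is no data or no variance yet.
    """
    n = wins + draws + losses
    if n == 0:
        return 0.0
    w, d = wins / n, draws / n
    score = w + d / 2.0            # mean score per game
    second_moment = w + d / 4.0    # mean of squared per-game score
    variance = second_moment - score ** 2
    if variance <= 0.0:
        return 0.0
    var_of_mean = variance / n
    s0, s1 = elo_to_score(elo0), elo_to_score(elo1)
    return (s1 - s0) * (2.0 * score - s0 - s1) / (2.0 * var_of_mean)


def sprt_decision(wins, draws, losses, elo0, elo1, alpha=0.05, beta=0.05) -> str:
    """Compare the LLR against Wald's bounds and report the test state."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    llr = sprt_llr(wins, draws, losses, elo0, elo1)
    if llr >= upper:
        return "accept H1"
    if llr <= lower:
        return "accept H0"
    return "continue"


if __name__ == "__main__":
    # Hypothetical match result, testing H0: 0 Elo against H1: +5 Elo.
    print(sprt_decision(400, 400, 200, elo0=0, elo1=5))
```

The point of the sequential test is that you check the decision after every game (or batch of games) and stop as soon as it is no longer "continue", rather than fixing the number of games in advance.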