Will your program ever be fast if you don’t learn the microarchitecture of your CPU first? :)
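PyPy is a valid option, and one I would explore if it fits what you are doing: its JIT tends to help most with long-running, pure-Python code, and less with code that leans heavily on C extensions.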