In addition, both have a property similar to dispersion (what cryptographers call diffusion, or the avalanche effect). In crypto, each change to an input bit should cascade through as many output bits as possible. In ML, each output bit should depend on as many of the input bits (and hidden-layer units) as possible. So, in a loose sense, both aim to spread information as widely as possible, which resembles a maximization of entropy.
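
To make the crypto half of that concrete, here is a minimal sketch that measures the cascade on a hash function. It assumes SHA-256 from Python's standard `hashlib` as the stand-in primitive (the comment above does not name one), and the helper names `bit_diff`, `flip_bit`, and `avalanche_fraction`, as well as the sample message, are made up for illustration. It flips each input bit in turn and reports the average fraction of output bits that change; for a function with good diffusion one would expect a value near one half.

```python
import hashlib


def bit_diff(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def flip_bit(data: bytes, index: int) -> bytes:
    """Return a copy of data with the bit at position index inverted."""
    out = bytearray(data)
    out[index // 8] ^= 1 << (index % 8)
    return bytes(out)


def avalanche_fraction(message: bytes) -> float:
    """Average fraction of SHA-256 output bits that change per one-bit input flip."""
    base = hashlib.sha256(message).digest()
    n_in = len(message) * 8   # number of input bits to flip, one at a time
    n_out = len(base) * 8     # 256 output bits for SHA-256
    total = sum(
        bit_diff(base, hashlib.sha256(flip_bit(message, i)).digest())
        for i in range(n_in)
    )
    return total / (n_in * n_out)


# Hypothetical sample input; any byte string works.
fraction = avalanche_fraction(b"hypothetical sample message")
print(f"average fraction of output bits flipped: {fraction:.3f}")
```

The same kind of probe could in principle be pointed at a trained network (perturb one input feature, see how many outputs move), though the analogy is loose: a network's outputs are continuous and are not designed to change unpredictably.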