It should be noted that MaxSAT 2024 did not include z3, as with many competitions. It’s possible (I’d argue likely) that the agent picked up on techniques from Z3 or some other non-competing solver, rather than actually discovering some novel approach.
Or for that matter even from later versions of the same solvers that were in its training data!
Is z3 competitive in SAT competitions? My impression was that it is popular due to the theories, the python API, and the level of support from MSR.
Z3 is capable (it’s an SMT solver, not just SAT), but it’s not very fast at boolean satifiability and not at all competitive with modern SOTA SAT solvers. Try comparing it to Chaff or Glucose e.g.