I don't really understand the internal mechanics of this, but my first thought was: why not combine this with a linter or tests, so that it produces all the forks and keeps only the syntactically correct ones?
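Roughly what I have in mind, as a minimal sketch and not how the tool actually works: it assumes the forks are Python source strings and uses the standard library's `ast.parse` as a stand-in for a linter. The `keep_parseable` helper and the sample `candidates` are hypothetical names I made up for illustration.

```python
import ast


def keep_parseable(candidates):
    """Return only the candidates that parse as Python source (hypothetical helper)."""
    kept = []
    for source in candidates:
        try:
            ast.parse(source)  # syntax check only; nothing is executed
        except (SyntaxError, ValueError):
            continue  # broken syntax: drop this fork
        kept.append(source)
    return kept


# Made-up forks standing in for generated outputs.
candidates = [
    "def add(a, b):\n    return a + b\n",
    "def add(a, b:\n    return a + b\n",   # missing closing parenthesis
    "print('hello'\n",                    # unclosed call
    "x = [i * 2 for i in range(3)]\n",
]

valid = keep_parseable(candidates)
print(f"kept {len(valid)} of {len(candidates)} forks")
```

Note that this only filters after the fact: every fork is generated in full before it gets checked.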
That’s going to be inefficient when most of the generations have broken syntax and can’t even parse.