I don't know about this particular case though. I get the feeling there's a system to it that can be exploited by e.g. Wolfram. It's just that you're in the dark for a long time before you find the switch.
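Your intuition is right. There is a general algorithm for finding antiderivatives, the Risch algorithm (https://en.wikipedia.org/wiki/Risch_algorithm). Given an elementary function, it either finds an elementary antiderivative or shows that none exists. Its simplified form can solve pretty much all the undergrad antidifferentiation problems.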
I'm a math major, but I consider the time spent learning the tricks for antidifferentiation to be kinda useless.
I think Wolfram just tokenizes everything and does pattern matching to find compositions it can exploit. It's not unlike compiler optimization.
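To make the pattern-matching idea concrete, here is a toy sketch in Python. It is not how Wolfram or the Risch algorithm actually works. It only shows the "match a shape, apply a rule" approach on a tiny expression tree. The tuple encoding and the `integrate` function are made up for illustration.

```python
from fractions import Fraction

X = "x"  # the integration variable

# Hypothetical encoding: expressions are numbers, "x", or nested tuples such as
# ("add", a, b), ("mul", constant, e), ("pow", "x", n), ("sin", "x").
TABLE = {
    ("cos", X): ("sin", X),
    ("sin", X): ("mul", -1, ("cos", X)),
    ("exp", X): ("exp", X),
}

def integrate(expr):
    """Return an antiderivative of expr in x, or None if no rule matches."""
    # Rule: a constant c integrates to c*x.
    if isinstance(expr, (int, Fraction)):
        return ("mul", expr, X)
    # Rule: x integrates to x^2/2.
    if expr == X:
        return ("mul", Fraction(1, 2), ("pow", X, 2))
    op = expr[0]
    # Rule: linearity over sums, integrate each term separately.
    if op == "add":
        parts = [integrate(term) for term in expr[1:]]
        if any(p is None for p in parts):
            return None
        return ("add", *parts)
    # Rule: pull a constant factor out of the integral.
    if op == "mul" and isinstance(expr[1], (int, Fraction)):
        inner = integrate(expr[2])
        return None if inner is None else ("mul", expr[1], inner)
    # Rule: power rule for x^n with integer n != -1.
    if op == "pow" and expr[1] == X and isinstance(expr[2], int) and expr[2] != -1:
        n = expr[2]
        return ("mul", Fraction(1, n + 1), ("pow", X, n + 1))
    # Fallback: look the whole expression up in a small table of known forms.
    return TABLE.get(expr)

# 3*x^2 + cos(x) matches the rules; exp(x^2) matches nothing and gives None.
print(integrate(("add", ("mul", 3, ("pow", X, 2)), ("cos", X))))
print(integrate(("exp", ("pow", X, 2))))
```

The weakness is visible right away: when no rule fires, the function cannot tell "no elementary antiderivative exists" apart from "I don't know the trick". That gap is what a decision procedure like Risch is meant to close.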