Think about it conceptually:
Could you watch a music video and say "that's the snare drum, that's the lead singer, keyboard, bass, that's the truck that's making the engine noise, that's the crowd that's cheering, oh and that's a jackhammer in the background"? So can AI.
Could you point out who is lead guitar and who is rhythm guitar? So can AI.
I mean, sometimes I'm -mixing- a show and I couldn't tell you where a specific sound is coming from...