The detection problem is genuinely hard. Even desktop AI agents I've been working with recently can control Spotify, fill out forms, and navigate apps, all indistinguishable from human interaction at the OS level. If that's hard to detect at the application layer, detecting AI-generated music at the audio layer seems like a cat-and-mouse game that Tidal will struggle to win without self-reporting from uploaders.
I feel like audio-level heuristics will be easier, but ultimately who's to say?
> Generative models synthesize sound mathematically. These synthesis methods leave unnatural dips, specific spectral noise profiles, or phase alignments that rarely occur in real, human-recorded audio
You are correct, but I think having a good policy, and trying earnestly to enforce it, is a good start, even if that enforcement is very imperfect.
Let's go with impossible.
Maybe if enough AI producers self-report their work as AI, and enough non-AI producers are honest about uploading non-AI work, they'll quickly have the necessary amount of good-enough data to train good classifiers?
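
For what it's worth, here is a toy sketch of that idea in Python, under heavy assumptions. Everything in it is made up for illustration: the single "artifact score" feature is hypothetical (no real audio is analyzed), the data is synthetic, and the 10% rate of AI uploaders who don't self-report is a guess. It is not how Tidal or anyone else actually does this. It only shows that a classifier trained on imperfect self-reported labels can still roughly recover the true split when the underlying signal exists.

```python
import math
import random

def make_tracks(n, lie_rate, rng):
    """Return (score, true_label, reported_label) triples from a toy distribution."""
    tracks = []
    for _ in range(n):
        is_ai = rng.random() < 0.5
        # Hypothetical "artifact score": assumed higher on average for generated audio.
        score = rng.gauss(2.0 if is_ai else 0.0, 1.0)
        # Assumption: some AI uploaders do not self-report; human uploaders are honest.
        reported = is_ai and rng.random() >= lie_rate
        tracks.append((score, int(is_ai), int(reported)))
    return tracks

def train_logistic(xs, ys, lr=0.5, epochs=500):
    """Fit p(ai | x) = sigmoid(w*x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            gw += (p - y) * x
            gb += p - y
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

def accuracy(w, b, xs, ys):
    """Fraction of tracks where the predicted class matches the given labels."""
    hits = sum(int(w * x + b > 0) == y for x, y in zip(xs, ys))
    return hits / len(xs)

rng = random.Random(0)
train = make_tracks(2000, lie_rate=0.1, rng=rng)   # noisy self-reported labels
test = make_tracks(1000, lie_rate=0.0, rng=rng)    # scored against the true labels

# Train on what uploaders reported, evaluate on what the tracks really are.
w, b = train_logistic([t[0] for t in train], [t[2] for t in train])
acc = accuracy(w, b, [t[0] for t in test], [t[1] for t in test])
print(f"accuracy against true labels: {acc:.2f}")
```

The catch is the assumption baked into `make_tracks`: the two classes are separable at all. If generators get good enough that the score distributions overlap completely, no amount of honest labeling helps.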