My prompting is conservative: if there's a chance a segment might be part of the actual content, it errs on the side of playing it, even if that lets an ad through. So far I'm not really getting any false positives (content wrongly tagged as an ad). That said, it's still in development, and I haven't yet run it on a large collection of podcasts to get more representative statistics.
I think the accuracy of my prompt/LLM is also ~85%. I've got a collection of 2500+ podcast episode transcripts (English language) with ads that I'm going to analyze shortly to find out whether I'm missing any ads or tagging some falsely.
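
For what it's worth, here's a rough sketch of how that analysis could be scored. It assumes each transcript is split into numbered segments and that there's a reference set of known ad segments per episode to compare against. Neither of those is settled yet, and all the names below (`Counts`, `score_episode`, `score_collection`) are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    tp: int = 0  # ad segments correctly tagged as ads
    fp: int = 0  # content segments wrongly tagged as ads
    fn: int = 0  # ad segments that were missed

    @property
    def precision(self) -> float:
        # With nothing tagged there can be no false positives, so report 1.0.
        flagged = self.tp + self.fp
        return self.tp / flagged if flagged else 1.0

    @property
    def recall(self) -> float:
        # With no real ads there is nothing to miss, so report 1.0.
        actual = self.tp + self.fn
        return self.tp / actual if actual else 1.0


def score_episode(predicted: set, actual: set) -> Counts:
    """Compare predicted ad segment indices against the reference ones."""
    return Counts(
        tp=len(predicted & actual),
        fp=len(predicted - actual),
        fn=len(actual - predicted),
    )


def score_collection(episodes) -> Counts:
    """Sum counts over an iterable of (predicted, actual) pairs."""
    total = Counts()
    for predicted, actual in episodes:
        c = score_episode(predicted, actual)
        total.tp += c.tp
        total.fp += c.fp
        total.fn += c.fn
    return total


if __name__ == "__main__":
    # Hypothetical toy data: (segments tagged as ads, segments that really are ads)
    episodes = [({1, 2}, {1, 2, 3}), ({5}, {5}), (set(), {7})]
    total = score_collection(episodes)
    print(f"precision={total.precision:.2f} recall={total.recall:.2f}")
```

With a conservative prompt like mine, I'd expect precision to stay high and recall to be where the misses show up, so splitting the two is more useful than a single accuracy number.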