What's fun is you can still do black box probing. And guess what, spammers have done this.
I get these emails that look like classic spam, like a link to a Home Depot or Walmart gift card, but they're addressed to someone who isn't me. After getting a bunch of these I decided to look at the original email. They are being sent to an Outlook address (e.g. [email protected]) and appear to come from something that looks like a store (e.g. [email protected]>). They pass SPF and DMARC but fail DKIM.
The content?
It used to be PAGES of stuff like "here's your email password reset link" or "thank you for signing up for <legitimate place>". I was confused at first but then realized that yeah, this stuff likely bypasses an ML filter. But the spammers have gotten better at it and now they can do it with only a page of content.
Of course, I can easily filter these by just parsing the "To" address (I use Thunderbird). I reported tons of these and was deleting them, but in the middle of last year I decided to just start collecting them. I have over 50...
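For anyone who wants the same check outside of Thunderbird, here's a rough sketch using only Python's standard `email` module. It is not how Thunderbird does it. `MY_ADDRESSES` and `addressed_to_me` are names I made up for illustration, and the address is a placeholder.

```python
from email import message_from_string
from email.utils import getaddresses

# Hypothetical: replace with the addresses you actually receive mail at.
MY_ADDRESSES = {"me@example.com"}

def addressed_to_me(raw_message, my_addresses=MY_ADDRESSES):
    """Return True if any To/Cc recipient is one of my addresses."""
    msg = message_from_string(raw_message)
    # Collect every To and Cc header (a message may carry several).
    headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    # getaddresses yields (display_name, address) pairs.
    recipients = {addr.lower() for _name, addr in getaddresses(headers) if addr}
    return bool(recipients & my_addresses)
```

Anything where this returns False goes to a quarantine folder. The obvious caveat is that legitimate mail you were only BCC'd on (mailing lists, for example) will also fail the check, so it needs an allowlist in practice.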
This is low-hanging fruit stuff... Like, a Naive Bayes could handle this. The current solution could probably handle it if they started actually fucking labeling the examples as spam and assumed that the labeling process was noisy (dear god I hope they use at least "legit", "unknown", "spam" and don't assume legit if it isn't marked as spam...)
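To make the labeling point concrete, here's a toy multinomial Naive Bayes in plain Python. It's a sketch under my own assumptions, not anyone's production filter: the three label names come from the comment above, the function names and training strings are invented, and "unknown" examples are simply left out of training instead of being counted as legit.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    return text.lower().split()

def train(examples):
    """examples: (text, label) pairs, label in {"spam", "legit", "unknown"}."""
    word_counts = defaultdict(Counter)  # label -> word frequencies
    doc_counts = Counter()              # label -> number of messages
    for text, label in examples:
        if label == "unknown":
            continue  # unlabeled mail is NOT assumed to be legit
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocab

def classify(text, model):
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        # Log prior plus log likelihood with add-one (Laplace) smoothing.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                count = word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up toy data, purely for illustration.
model = train([
    ("claim your free gift card now", "spam"),
    ("free gift card winner click link", "spam"),
    ("meeting notes attached for tomorrow", "legit"),
    ("lunch tomorrow at noon", "legit"),
    ("free gift card from a friend", "unknown"),
])
print(classify("free gift card", model))  # spam
print(classify("lunch tomorrow", model))  # legit
```

The point of the third label is that the "unknown" message never contributes to the legit word counts. If user reports actually fed back in as "spam" labels, the padding text these emails carry would start counting as spam evidence too.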
I have EVEN TALKED TO A PERSON and the issue couldn't be escalated... Which IMO is being complacent about spam.