This is an interesting sentiment given how desperate AI labs seem to be to source any new internet content from any walled-garden platform willing to take their money (and how willing they are to try & take it even if you don't consent).
Abusive, sneaky scraping is absolutely through the roof.
I feel as though you are confusing scraping by random companies that use AI with scraping by the actual AI companies. The AI companies seem to see value in walled-garden sources like Reddit, Stack Overflow, etc. However, I don't think there has been any significant instance of a major American AI company aggressively scraping websites and not respecting robots.txt.