The big companies tend to respect robots.txt. The problem is that other, unscrupulous actors use fake user agents and residential IPs, ignore robots.txt, and don't behave reasonably.
Big companies have thrown robots.txt to the wind when it comes to their precious AI models.