What is the specific concrete purpose of downloading millions of URLs per hour across different domains if it's "not doing anything wrong"?
Mostly ecommerce and pricing data. I work for marketplaces, brands, retail stores, and even our own SaaS competitors. We match the EAN (GTIN) to the correct SKU within seconds (Google Shopping, Amazon, etc.). Part of it runs on our own trained ML models.
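(A minimal sketch of what GTIN-to-SKU matching can look like, assuming a pre-built catalog index; all names here, like `normalize_gtin` and `CATALOG`, are hypothetical and not the poster's actual pipeline.)

```python
def gtin_check_digit_ok(gtin: str) -> bool:
    """Validate the GS1 check digit (last digit) of a numeric GTIN."""
    digits = [int(c) for c in gtin]
    body, check = digits[:-1], digits[-1]
    # Weights alternate 3,1,3,1,... starting from the digit nearest the check digit.
    total = sum(d * (3 if i % 2 == 0 else 1) for i, d in enumerate(reversed(body)))
    return (10 - total % 10) % 10 == check

def normalize_gtin(raw: str) -> str:
    """Strip separators and left-pad to 14 digits so EAN-13/UPC-A keys collide correctly."""
    digits = "".join(c for c in raw if c.isdigit())
    return digits.zfill(14)

# Hypothetical catalog index: normalized GTIN -> internal SKU.
CATALOG = {
    normalize_gtin("4006381333931"): "SKU-10042",  # example EAN-13
}

def match_sku(raw_gtin: str) -> str | None:
    gtin = normalize_gtin(raw_gtin)
    if not gtin_check_digit_ok(gtin):
        return None
    return CATALOG.get(gtin)

print(match_sku("4006381333931"))  # -> SKU-10042
```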
Might it be for scraping content to train an LLM? Oh no, only big tech is allowed to do that...