> If I run a big data hungry AI lab consuming training data at 100Gb/s it's much much easier to scrape 10,000 sites at 10Mb/s than DDOS a smaller number of sites with more traffic
A little over a decade ago (f*ck I'm old now [0]), I had a similar conversation with an ML Researcher@Nvidia. Their response was "even if we are overtraining, it's a good problem to have because we can reduce our false negative rate".
Everyone continues to have an incentive to optimize for TP and FP at the expense of FN.