You cap the crawl time or the number of requests for every domain, and set each cap in proportion to how important the domain is.
There's a ton of this kind of thing online; you can't, e.g., exhaustively crawl every Wikipedia mirror someone's put online.
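
A rough sketch of what that could look like in Python, assuming you already have some importance score per domain (how you compute it is up to you). The names `allocate_budgets` and `DomainBudget`, the scores, and the `wiki-mirror.example` host are all made up for illustration, and this only covers the request-count variant, not crawl time:

```python
from urllib.parse import urlsplit


def allocate_budgets(importance, total_requests, floor=1):
    """Split total_requests across domains in proportion to their importance.

    `importance` maps domain -> score (hypothetical, e.g. a link-based rank).
    Every listed domain gets at least `floor` requests.
    """
    total = sum(importance.values())
    if total <= 0:
        raise ValueError("importance scores must sum to a positive number")
    return {
        domain: max(floor, int(total_requests * score / total))
        for domain, score in importance.items()
    }


class DomainBudget:
    """Tracks how many requests each domain has left."""

    def __init__(self, budgets, default=0):
        self.remaining = dict(budgets)
        self.default = default  # budget for domains with no score at all

    def try_consume(self, url):
        """Return True and spend one request if the URL's domain has budget left."""
        domain = urlsplit(url).hostname or ""
        left = self.remaining.get(domain, self.default)
        if left <= 0:
            return False
        self.remaining[domain] = left - 1
        return True


# Assumed scores: the original site matters far more than a mirror of it.
importance = {"en.wikipedia.org": 9.0, "wiki-mirror.example": 1.0}
budgets = allocate_budgets(importance, total_requests=100)
budget = DomainBudget(budgets)
```

The crawler would call `budget.try_consume(url)` before each fetch and skip the URL when it returns `False`, so a low-value mirror runs out of budget quickly while the important domain keeps getting crawled.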