Just brainstorming here, but would a distributed search index be feasible and usable with current network speeds and latency? I'm not sure how to set up the data structure so that a query doesn't require many high-latency hops, but maybe someone has solved this problem.
It's possible; see the YaCy project. It probably suffers from having a couple of orders of magnitude too few resources (in the funding/development sense) to be really competitive, though.
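On the data-structure question, one common way to keep the hop count down is to partition the inverted index by term, so each query term maps to exactly one node. Here is a rough sketch of that idea, not YaCy's actual code: the class name `TermShardedIndex`, the node names, and the in-memory "shards" are all made up for illustration, and it assumes every peer knows the full node list.

```python
import hashlib
from bisect import bisect_right


class TermShardedIndex:
    """Toy term-partitioned inverted index on a consistent-hash ring.

    Hypothetical illustration only: shards are in-memory dicts standing in
    for remote peers, and every peer is assumed to know the whole ring.
    """

    def __init__(self, node_names):
        # Place each node on the ring at the hash of its name.
        self.ring = sorted((self._hash(name), name) for name in node_names)
        self.keys = [h for h, _ in self.ring]
        # One posting-list store per node: term -> set of document ids.
        self.shards = {name: {} for name in node_names}
        self.lookups = 0  # counts simulated network requests

    @staticmethod
    def _hash(text):
        digest = hashlib.sha1(text.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, term):
        # The first node clockwise from the term's hash owns the term.
        i = bisect_right(self.keys, self._hash(term)) % len(self.ring)
        return self.ring[i][1]

    def add(self, doc_id, text):
        for term in set(text.lower().split()):
            shard = self.shards[self.node_for(term)]
            shard.setdefault(term, set()).add(doc_id)

    def search(self, query):
        # Deduplicate terms while keeping their order.
        terms = list(dict.fromkeys(query.lower().split()))
        postings = []
        for term in terms:
            self.lookups += 1  # one request to the node owning this term
            postings.append(self.shards[self.node_for(term)].get(term, set()))
        # AND semantics: a document must contain every query term.
        return set.intersection(*postings) if postings else set()


index = TermShardedIndex(["node-a", "node-b", "node-c"])
index.add(1, "distributed search index")
index.add(2, "centralized search engine")
print(index.search("distributed search"), index.lookups)
```

With this layout a k-term query costs k requests, and since they are independent they can be sent in parallel, so latency is roughly one round trip to the slowest node. The catch is that in a real DHT where peers only know part of the ring, finding the owner of a term typically takes on the order of log(n) hops, and posting lists for common words get very large to ship around.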