Hacker News

DeusExMachina yesterday at 10:28 PM

I don't understand the endgame here. Websites let Google crawl their content in exchange for traffic. If Google cuts that out completely, what incentive do websites have to not block the Google crawlers?

I understand that Google is feeling an existential threat from other AI products that provide answers directly. But they must also understand their symbiotic relationship with the web.


Replies

AndroTux yesterday at 10:36 PM

The endgame is the consumer no longer leaving Google, and the web becoming synonymous with Google for them. Why shop on some random website when you can have Gemini buy it for you? Why look for information on Wikipedia when… you get the idea.

I think the coming years will be pivotal for the web. Facebook attempted a similar strategy back when their apps got traction, but they ultimately failed. Let’s hope Google fails too.

WD-42 yesterday at 11:06 PM

What I really don't understand is where the next generation of training material will come from. If websites stop being published and/or crawled, how will the machine continue to be fed?

dyauspitr today at 1:09 AM

If they block Google’s crawlers, no one visits their site ever.

jjulius yesterday at 10:34 PM

The long run doesn't matter as much as the short-term gains for those in power.

properbrew yesterday at 11:53 PM

Is it just an exchange for traffic? I run a website, and I'm perfectly happy if a user never lands on it themselves with a browser on their device. If they are given the information I'm providing, or purchase a service through the AI product, it makes no difference to me.

Some websites can run only on ads. Is it such a bad thing that they would die off?

I say this as someone who likes the old web and has fun hitting the "surprise me" button on https://wiby.me/ (not affiliated) and browsing the random sites. Just giving an alternative view.

phendrenad2 today at 12:56 AM

Information, correct information, is the new gold. We've seen what LLMs can do with the rubbish heap of information that is available on the current internet. The next step is refined, concise information sources. Think the Encyclopedia Britannica. And not only that, but models trained by experts. Right now everything is cheap and plentiful. Anyone can ask ChatGPT the same question and get the same middling answer. In the future, someone will make a dataset about a subject, train a model on it, and all the big companies and players in that area will pay for it.

hotstickyballs yesterday at 10:31 PM

The web is going to become like China, which is a collection of walled gardens.

cloche yesterday at 11:52 PM

Is there a way to reliably block Google and AI crawlers?
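
For crawlers that choose to honor it, robots.txt is the usual starting point, though it is only advisory and does nothing against a crawler that ignores it. Below is a minimal sketch, assuming a hypothetical site at example.com and a blanket "Disallow: /" policy. The user-agent tokens are ones their operators publish (Google-Extended is, as far as I know, a control token for AI training use rather than a separate crawler), and blocking Googlebot also removes the site from Search. The helper name `is_allowed` is made up for illustration; it uses Python's standard `urllib.robotparser` to check what the rules would permit.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only. The blanket "Disallow: /"
# policy is an assumption; adjust the list of tokens to taste.
ROBOTS_TXT = """\
User-agent: Googlebot
User-agent: Google-Extended
User-agent: GPTBot
User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Listed crawlers are denied; anything else falls through to "Allow: /".
    for agent in ("Googlebot", "GPTBot", "Mozilla/5.0"):
        print(agent, is_allowed(agent, "https://example.com/article"))
```

This only tells you what a well-behaved crawler should do. Actually enforcing a block would need server-side measures such as user-agent or IP filtering, which is a separate problem.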

archagon yesterday at 11:23 PM

Google ignores robots.txt and uses botnets of residential addresses to crawl anyway? (LLM startups already do this.)

winterbourne yesterday at 10:48 PM

> If Google cuts that out completely, what incentive do websites have to not block the Google crawlers?

Completely, yes, that destroys the incentive. But they can reduce it by 80% or 90% or so, to the point that it's just barely worthwhile to allow their crawlers.

hsuduebc2 yesterday at 11:20 PM

You will be kept inside the Google ecosystem the same way people are kept inside Facebook.

I’m curious how they plan to generate new content in the future, because it seems obvious that simple web pages will become obsolete and eventually stop being filled with fresh data.

It will probably end with a warning every time you click a link, something like: “You are leaving to an external unsafe site.”

AlienRobot yesterday at 11:08 PM

The impression I get from Google's own marketing material is that Google doesn't believe in "the web". And it hasn't believed in the web for years.

Think about it. Pretty much every time, they show a search box with someone asking for directions to a physical place, what hours it is open, etc.

The greatest thing about the internet is that it has removed distances around the whole world, but Google's major value proposition seems to be that... it can accurately index and query information about local businesses?