Hacker News

ekidd yesterday at 7:00 PM (21 replies)

Which as some running a website raises a fascinating question. If Google is just going to crawl my sites and present information as an AI summary on their site, then what exactly do I gain by allowing Googlebot to crawl my sites?


Replies

pflenker yesterday at 7:11 PM

A couple of years back I worked with a company which maintained specific data which was the main traffic driver on that page. Google approached them and wanted to pay for the rights to get the data and display it on top of the search results, a feature which was fairly new back then.

This was an interesting dilemma because it was very clear that the money was way less than the loss in ad revenue due to the traffic drop, but it was also clear that if we didn’t take the deal, a more desperate competitor would, which would result in the same traffic loss but without the extra Google money. So the company took the deal.

History repeats itself here, with the difference that instead of paying for the data, the AI crawlers simply take it for free.

wvenable yesterday at 7:37 PM

It's a catch-22. Without Google crawling your site, you don't get any new traffic. But with Google crawling your site, you also might not get any traffic.

AI summarization has already caused issues for sites like rtings, where people are no longer visiting the site but are still making use of the data presented there, leading to rtings not getting enough traffic to continue posting their data.

It is an existential crisis for websites and when they go away it'll be an existential crisis for AI.

pokot0 yesterday at 7:57 PM

The internet is more and more becoming a commercialization platform. If you are selling something on your website, you still want Google (or ChatGPT for that matter) to expose customers to your product. The gate is that the actual delivery of the product is behind a purchase/signup. Google and others want to control the entire customer journey, to the point that your website is simply a way to pass metadata to them. They are actually achieving this!

This kills the entire internet vibe of the 90s and early 2000s.

lifis today at 2:51 AM

The expected purpose of websites is to spread information, so whether users get it by making a request to your website or to Google is irrelevant. In fact, if they get it from Google it's better because it reduces website load.

If instead the purpose of your website is to manipulate users for financial gain (for instance by showing media attempting to manipulate their purchasing decisions, after receiving a bribe from a vendor), and the information is just a way to lure users, then maybe this malicious business model will finally be no longer possible.

rdedev yesterday at 10:15 PM

Sites pay good money to appear at the top of search results. Looks like the future is going to be sponsored AI sources. It's going to be even more difficult to figure out if Google is presenting you with actual information instead of just an ad.

jefftk yesterday at 8:39 PM

I write things on the internet because I want to share ideas. If someone reads my post and tells a friend, that's great. If an AI crawls my posts and passes along the ideas, that's great too.

(It doesn't work for ad-funded writing, but while I have substantial sympathy there, this has historically been an unpopular argument on HN.)

nine_k yesterday at 8:58 PM

If your site is all about disseminating information (like Wikipedia), then Google would provide a free mirror of sorts.

If your site is about your product, Google won't be able to serve the sign-up page from AI; the traffic would come your way. Same for a site that sells something: the traffic you're interested in would arrive at your checkout page.

Paid-content sites and ad-supported sites are screwed though, on top of their being screwed by archive.is and ad blockers.

prinny_ yesterday at 8:21 PM

You're allowed to exist on the web. The alternative is that you are pushed out, your site is not indexed, and Google / Chrome labels it as a security risk when people try to reach it directly. The mandate is clear: give up the data or give up the spot.

deaton yesterday at 7:42 PM

What you gain? Nothing, but they and other AI companies have decided not to respect your robots.txt

try-working yesterday at 10:33 PM

That's Google making way for its disruptor. We'll see who that is. Imagine a search engine that just presents search results. Groundbreaking.

Andrex yesterday at 7:10 PM

Free speculation: I could see a future where Google populates a footer on results with the website logos of the sources. Presumably, the new web will require users to memorize websites/brands and go directly to those sites if they see a lot of their results are being provided by one source.

Websites may go back to being simply labors of love.

franze yesterday at 8:13 PM

Well, it's already happening and people are already fighting over traffic crumbs; they call it GEO.

victorbjorklund yesterday at 8:42 PM

Maybe you want your ideas to spread? If your site's purpose is getting ad impressions, then yeah, no point. But if your purpose is to spread ideas, then it is still useful.

coldpie yesterday at 7:28 PM

> allowing Googlebot to crawl my sites

As far as I know, you don't have a choice. They have no obligation to respect your wishes, and LLMs are legally allowed to scrape & republish your content.

tedd4u yesterday at 7:29 PM

Vastly less but still more traffic than if you didn’t participate. I’m sure they will calibrate it just so.

whazor yesterday at 7:04 PM

Websites tend to be updated and considered to be the source as well.

gerdesj yesterday at 11:39 PM

(You misspelled someone as some)

Google has always crawled your site and been an arse! Now you get to decide whether they are hallucinating!

You can drop pointers on Masto and other socials to your sites - that has not changed.

Do we need something else? I.e., you drop a link to somewhere else.

UltraSane yesterday at 8:20 PM

Can you actually prevent Google from crawling your site?

thedelanyo yesterday at 7:16 PM

> then what exactly do I gain by allowing Googlebot to crawl my sites?

Mention

esseph yesterday at 9:11 PM

> what exactly do I gain by allowing Googlebot to crawl my sites?

Site traffic

swarnie yesterday at 7:20 PM

Allow? Deep down, do you think you have a choice?

Mechanisms might exist to make you think you have one, the same way copyright should prevent millions of books being gobbled up by TheZuck, but ultimately, do you really have a choice?

Rules and laws don't exist for you.
