Hacker News

reconnecting, today at 9:02 AM (5 replies)

I have bad news for you: LLMs are not reading llms.txt or AGENTS.md files from servers.

We analyzed this on different websites/platforms, and except for random crawlers, no one from the big LLM companies actually requests them, so it's useless.

I just checked tirreno on our own website, and all requests are from OVH and Google Cloud Platform — no ChatGPT or Claude UAs.
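For anyone who wants to run the same kind of check on their own server, here is a rough sketch that counts User-Agents requesting these files. It assumes access logs in the common "combined" format; the `count_user_agents` helper, the watched paths, and the sample lines are all made up for illustration and are not our actual data or tooling.

```python
import re
from collections import Counter

# Combined Log Format (assumed):
# ip - - [time] "METHOD path HTTP/x" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

# Paths of interest; adjust to whatever files you publish.
WATCHED_PATHS = {"/llms.txt", "/AGENTS.md"}


def count_user_agents(lines):
    """Count User-Agent strings for requests that hit a watched path."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        path = match.group("path").split("?", 1)[0]  # drop any query string
        if path in WATCHED_PATHS:
            counts[match.group("ua")] += 1
    return counts


# Invented sample lines, only to show the shape of the input.
SAMPLE = [
    '203.0.113.5 - - [10/Oct/2025:09:00:01 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "python-requests/2.31.0"',
    '203.0.113.5 - - [10/Oct/2025:09:00:02 +0000] "GET /llms.txt?x=1 HTTP/1.1" 200 512 "-" "python-requests/2.31.0"',
    '198.51.100.7 - - [10/Oct/2025:09:00:03 +0000] "GET /index.html HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
    '198.51.100.9 - - [10/Oct/2025:09:00:04 +0000] "GET /AGENTS.md HTTP/1.1" 404 0 "-" "curl/8.5.0"',
]

if __name__ == "__main__":
    for ua, hits in count_user_agents(SAMPLE).most_common():
        print(hits, ua)
```

To use it on a real log, pass an open file instead of `SAMPLE`, e.g. `count_user_agents(open("access.log"))`, where the path is whatever your server writes to.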


Replies

michaelcampbell, today at 12:42 PM

I also wonder: it's a normal scraper mechanism doing the scraping, right? Not necessarily an LLM in the first place, so the wholesale data-sucking isn't going to "read" the file even if it IS accessed?

Or is this file meant to be "read" by an LLM long after the entire site has been scraped?

cardanome, today at 9:11 AM

The best way to fight back is to create a tarpit that will feed them garbage: https://iocaine.madhouse-project.org/

Sharlin, today at 1:50 PM

You could insert the message on every single webpage you serve, hidden visually and from screenreaders.

whazor, today at 9:30 AM

What if you add a <!-- see /llms.txt --> comment to every .html page?
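A minimal sketch of what that could look like as a post-processing step, assuming you can run a function over each page's HTML before serving it. The `add_llms_hint` name is hypothetical, and whether any crawler or agent actually acts on such a comment is untested here.

```python
import re

HINT = "<!-- see /llms.txt -->"

# Matches the closing head tag regardless of case or stray whitespace.
HEAD_CLOSE = re.compile(r"</head\s*>", re.IGNORECASE)


def add_llms_hint(html: str) -> str:
    """Insert the hint comment just before </head>, or append it if there is no head."""
    if HINT in html:
        return html  # already present, keep the operation idempotent
    match = HEAD_CLOSE.search(html)
    if match is None:
        return html + "\n" + HINT
    return html[: match.start()] + HINT + "\n" + html[match.start():]


if __name__ == "__main__":
    page = "<html><head><title>t</title></head><body>hi</body></html>"
    print(add_llms_hint(page))
```

An HTML comment is invisible to both sighted users and screen readers, so it would not change how the page renders.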

GaggiX, today at 9:25 AM

This is meant for OpenClaw agents, so you are not gonna see a ChatGPT or Claude User-Agent. That's why they show it on a normal blog page and not just as /llms.txt.
