Hacker News

tsazan, yesterday at 2:52 PM

That is the current standard. But it is hard for agents to read efficiently. To access JSON-LD, an agent must download the entire HTML page. This creates a haystack problem where you download 2MB of noise just to find 5KB of data.

Even then, you pay a syntax tax. JSON is verbose. Brackets and quotes waste valuable context window. Furthermore, the standard lacks behavior. JSON-LD lists facts but lacks instructions on how to sell (like @SEMANTIC_LOGIC). CommerceTXT is a fast lane. It does not replace JSON-LD. It optimizes it.


Replies

inerte, yesterday at 5:19 PM

Wouldn't it be easier on everybody (servers and clients) to just expose Structured Data in a text file then? And add the one or two things it doesn't have?

tjhorner, yesterday at 5:36 PM

Who says you need to pipe the entire document with JSON-LD directly into the context window? I agree, that is very wasteful. You can just parse the relevant bits out and convert the JSON-LD data into something like your txt format before presenting it to the LLM. Bake that right into whatever tool it uses to scrape websites.
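
A rough sketch of that idea in Python, using only the standard library. The names `JsonLdExtractor`, `flatten`, and `jsonld_to_text` are made up for illustration, and the dotted `key: value` output is just one arbitrary compact layout, not the actual CommerceTXT format. It assumes the JSON-LD sits in `<script type="application/ld+json">` tags, which is the usual way of embedding it.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks instead of failing the whole page


def flatten(node, prefix=""):
    """Turn nested JSON into dotted 'key: value' lines (hypothetical layout)."""
    lines = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "@context":  # vocabulary URL, not needed by the model
                continue
            lines.extend(flatten(value, f"{prefix}{key}."))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            lines.extend(flatten(item, f"{prefix}{index}."))
    else:
        lines.append(f"{prefix[:-1]}: {node}")  # drop the trailing dot
    return lines


def jsonld_to_text(html):
    """Extract every JSON-LD block from an HTML string and flatten it to text."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(line for doc in parser.documents for line in flatten(doc))


if __name__ == "__main__":
    page = (
        '<html><head><script type="application/ld+json">'
        '{"@context": "https://schema.org", "@type": "Product", "name": "Mug",'
        ' "offers": {"price": "9.99", "priceCurrency": "USD"}}'
        "</script></head><body><p>lots of markup noise</p></body></html>"
    )
    print(jsonld_to_text(page))
```

The scraper still has to download the full page, so this only addresses the context-window cost, not the bandwidth cost raised in the parent comment.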
