If your output schema doesn’t capture all correct outputs, that’s a problem with your schema, not th...

supermdguy • today at 4:40 PM • 1 reply • view on HN

If your output schema doesn’t capture all correct outputs, that’s a problem with your schema, not the LLM. A human using a data entry tool would run into the wrong issue. Letting the LLM output whatever it wants just makes it so you have to deal with ambiguities manually, instead of teaching the LLM what to do.

I usually start by adding an error type that will be overused by the LLM, and use that to gain visibility into the types of ambiguities that come up in real-world data. Then over time you can build a more correct schema and better prompts that help the LLM deal with ambiguities the way you want it to.

Also, a lot of the chain of thought issues are solved by using a reasoning model (which allows chain of thought that isn’t included in the output) or by using an agentic loop with a tool call to return output.

Replies

dhruvbird • today at 4:55 PM

This ^^^^

While the provided schema has a "quantity" field, it doesn't mention the units.

<code>

class Item(BaseModel):

    name: str

    price: float = Field(description="per-unit item price")

    quantity: float = Field(default=1, description="If not specified, assume 1")

class Receipt(BaseModel):

    establishment_name: str

    date: str = Field(description="YYYY-MM-DD")

    total: float = Field(description="The total amount of the receipt")

    currency: str = Field(description="The currency used for everything on the receipt")

    items: list[Item] = Field(description="The items on the receipt")

</code>

There needs to be a better evaluation and a better provided schema that captures the full details of what is expected to be captured.

> What kind of error should it return if there's no total listed on the receipt? Should it even return an error or is it OK for it to return total = null?

Additionally, the schema allows optional fields, so the LLM is free to skip missing fields if they are specified as such.

alt Hacker News

Replies