
cadamsdotcom, yesterday at 10:47 PM

What about at inference time, i.e. in response to a query?

We already let models write code and run it, which gives them a high chance of getting arithmetic right.

Solving the “crossing the river” problem by letting the model create and run a simulation would give a pretty high chance of getting it right.
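As a rough sketch of what such a simulation might look like, here is a breadth-first search over the classic wolf, goat, and cabbage version of the puzzle. The choice of that variant, the state encoding, and the names `ITEMS`, `FORBIDDEN`, `is_safe`, and `solve` are my own assumptions for illustration, not something a particular model or API produces.

```python
from collections import deque

# Assumed variant: farmer, wolf, goat, cabbage; the boat holds the farmer plus one item.
ITEMS = ("wolf", "goat", "cabbage")
# Pairs that cannot be left together without the farmer.
FORBIDDEN = ({"wolf", "goat"}, {"goat", "cabbage"})


def is_safe(state):
    """state = (farmer, wolf, goat, cabbage); 0 = start bank, 1 = far bank."""
    farmer, *sides = state
    for bank in (0, 1):
        if bank == farmer:
            continue  # the farmer is present on this bank
        left_alone = {item for item, side in zip(ITEMS, sides) if side == bank}
        if any(pair <= left_alone for pair in FORBIDDEN):
            return False
    return True


def solve():
    """Return a shortest list of cargo choices, one per crossing, or None."""
    start, goal = (0, 0, 0, 0), (1, 1, 1, 1)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        farmer = state[0]
        # The farmer crosses alone (None) or with one item from the same bank.
        for cargo in (None, *ITEMS):
            nxt = list(state)
            nxt[0] = 1 - farmer
            if cargo is not None:
                idx = ITEMS.index(cargo) + 1
                if state[idx] != farmer:
                    continue  # the item is on the other bank
                nxt[idx] = 1 - farmer
            nxt = tuple(nxt)
            if nxt not in seen and is_safe(nxt):
                seen.add(nxt)
                queue.append((nxt, path + [cargo or "nothing"]))
    return None


if __name__ == "__main__":
    for step, cargo in enumerate(solve(), start=1):
        print(f"{step}. farmer crosses with {cargo}")
```

The search is exhaustive over a tiny state space, so the answer comes from checking every legal move rather than from the model recalling a memorized solution.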


Replies

Centigonal, today at 5:54 AM

The newest Claude update comes with a Python sandbox built right into the API for exactly this reason.

https://docs.anthropic.com/en/docs/agents-and-tools/tool-use...