What about at inference time, i.e. in response to a query?
We let models write code and run it, which gives them a high chance of getting arithmetic right.
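A minimal sketch of that idea (not any vendor's API): ask the model to answer with Python, pull the code block out of its reply, and run it, trusting the interpreter rather than the model's mental arithmetic. `ask_model` here is a hypothetical stand-in for whatever chat call you use.

    import re
    import subprocess
    import sys

    def run_model_code(reply: str) -> str:
        """Extract the first fenced code block from a model reply and execute it."""
        match = re.search(r"```(?:python)?\n(.*?)```", reply, re.DOTALL)
        if not match:
            return reply.strip()  # no code in the reply, fall back to the raw answer
        result = subprocess.run(
            [sys.executable, "-c", match.group(1)],
            capture_output=True, text=True, timeout=10,
        )
        return result.stdout.strip()

    # reply = ask_model("What is 123456789 * 987654321? Answer with Python code that prints it.")
    # print(run_model_code(reply))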
Solving the “crossing the river” problem by letting the model create and run a simulation would give a pretty high chance of getting it right.
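For example, here's the kind of simulation a model could write and run for itself: a brute-force BFS over states of the classic wolf/goat/cabbage version of the puzzle, so the answer comes out of search rather than pattern-matching.

    from collections import deque

    ITEMS = {"wolf", "goat", "cabbage"}

    def unsafe(bank):
        # A bank the farmer isn't on is unsafe if wolf+goat or goat+cabbage are left together.
        return {"wolf", "goat"} <= bank or {"goat", "cabbage"} <= bank

    def solve():
        start = (frozenset(ITEMS), "left")   # (items on the left bank, farmer's side)
        queue = deque([(start, [])])
        seen = {start}
        while queue:
            (left, farmer), path = queue.popleft()
            if not left and farmer == "right":
                return path                  # everything is across
            here = left if farmer == "left" else ITEMS - left
            for cargo in list(here) + [None]:   # carry one item, or cross alone
                new_left = set(left)
                if cargo:
                    if farmer == "left":
                        new_left.discard(cargo)
                    else:
                        new_left.add(cargo)
                new_left = frozenset(new_left)
                unattended = new_left if farmer == "left" else ITEMS - new_left
                if unsafe(unattended):
                    continue
                state = (new_left, "right" if farmer == "left" else "left")
                if state not in seen:
                    seen.add(state)
                    queue.append((state, path + [cargo or "nothing"]))

    print(solve())  # e.g. ['goat', 'nothing', 'wolf', 'goat', 'cabbage', 'nothing', 'goat']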
The newest Claude update comes with a Python sandbox built right into the API for exactly this reason.
https://docs.anthropic.com/en/docs/agents-and-tools/tool-use...
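Roughly, using it through the SDK looks like the sketch below. The exact model name, beta flag, and tool type strings are assumptions from memory; check them against the linked docs before relying on this.

    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    response = client.beta.messages.create(
        model="claude-sonnet-4-20250514",           # assumed model name
        betas=["code-execution-2025-05-22"],        # assumed beta flag
        max_tokens=1024,
        tools=[{"type": "code_execution_20250522",  # assumed tool type
                "name": "code_execution"}],
        messages=[{
            "role": "user",
            "content": "A boat holds the farmer plus one of: wolf, goat, cabbage. "
                       "Write and run a simulation to find a safe crossing order.",
        }],
    )
    print(response.content)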