Hacker News
Jensson
•
today at 3:44 AM
That doesn't test whether the model can follow and execute a dynamic plan reliably.