> This started breaking down after ~30 files.
Codex's short context and todolist system combined somehow helps here though. Because of the frequent compact. The model was forced to recheck what todo list item has not done yet and what workflow skill it has to use. I used to left it for multi hour to do a big clean up and it finished without obvious issues.
Is Codex willing to do "multi hour" tasks when used with a ChatGPT Plus subscription, or does it need something more expensive like Pro?