I built something similar but now use Codex instead.
With the VS Code extension, you get dynamic context management, which works really well.
They also have a memory system built using Reflexion (someone please correct me if I'm wrong), so proper evals are derived from lessons before they are stored.
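
To make the idea concrete, here's a rough sketch of what "derive an eval from a lesson and only store the lesson if it passes" could look like. This is not how Codex actually does it (I don't know its internals); `Lesson`, `LessonMemory`, and `check` are made-up names purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Lesson:
    text: str                  # verbal reflection written after an attempt
    check: Callable[[], bool]  # hypothetical eval derived from the lesson


@dataclass
class LessonMemory:
    lessons: List[str] = field(default_factory=list)

    def add(self, lesson: Lesson) -> bool:
        """Store the lesson only if its derived check passes."""
        if lesson.check():
            self.lessons.append(lesson.text)
            return True
        return False

    def as_context(self) -> str:
        """Render stored lessons as a block to prepend to the next prompt."""
        return "\n".join(f"- {text}" for text in self.lessons)


memory = LessonMemory()
# A lesson whose check passes is kept; one whose check fails is dropped.
memory.add(Lesson("Run the tests before committing.", check=lambda: True))
memory.add(Lesson("Unverified guess about the build.", check=lambda: False))
```

The point of the gate in `add` is that only lessons that survive some kind of check end up in the context for later attempts, so bad reflections don't pile up in memory.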