I started reading this and right away hit something that doesn't really make any sense to me:
> the extractor. the thing that reads conversation transcripts and decides what to keep.
> the most consequential choice an extractor makes is timing. extract eagerly, after every message, and you spend tokens on small talk that goes nowhere. extract lazily, at the end of a session, and the context you needed to resolve a pronoun is already gone.
If the input is coming from a transcript, then either that transcript contains enough context to understand what a particular pronoun refers to, or it doesn't.
If it does, why would waiting until the end of a session be a problem? What am I missing?
good catch - the example is sloppy. the real issue is lost-in-the-middle on long transcripts: the extracting model attends less reliably to material in the middle of a long context than to the start or end, so "the transcript is still there" doesn't mean the extractor sees all of it equally well.
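
for illustration, one possible middle ground between "after every message" and "end of session" is to extract over short overlapping windows, so no message sits deep in the middle of a long context and the overlap carries a few earlier messages forward for things like pronoun resolution. a minimal sketch, assuming the transcript is just a list of message strings; `overlapping_windows`, `size`, and `overlap` are made-up names for this example, not anything from the article:

```python
from typing import Iterator, List


def overlapping_windows(
    messages: List[str], size: int = 20, overlap: int = 5
) -> Iterator[List[str]]:
    """Yield windows of up to `size` messages; each window repeats the
    last `overlap` messages of the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # how far the window start advances each time
    start = 0
    while start < len(messages):
        yield messages[start:start + size]
        if start + size >= len(messages):
            break  # this window already reached the end of the transcript
        start += step


# hypothetical usage: each window would go to the extractor as its own short context
transcript = [f"message {i}" for i in range(10)]
for window in overlapping_windows(transcript, size=4, overlap=1):
    print(window[0], "...", window[-1])
```

the window size and overlap here are arbitrary; whether this actually beats a single end-of-session pass depends on the model and on how long the transcripts get.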