> Human attention is truly ephemeral, with a ridiculously short span.
I do not believe this at all. I think you'd have to have a very limited experience working with other human beings to be able to believe this.
> and by training it better.
"Oh yea, just do it _better_. That's your problem." Perhaps some people operate without any context, but most of us find the experience lacking.
That's a simple neurological fact; there's nothing to believe. A trivial experiment will show you that you can't keep details in your sliding attention window for more than a few seconds, or focus on more than a few things simultaneously. Human memory and cognition are layered, and this immediate layer is what most resembles the model's context.
You're possibly mixing it up with long-term memory, which doesn't keep immediate facts and details; it holds heavily processed and compressed summaries, for lack of a better analogy from the LLM world. You aren't keeping the entire codebase in your memory, just a highly processed and conceptualized version of it. This conceptualization can be somewhat emulated with an agentic loop, but that only goes so far: current models quickly lose coherence and aren't good enough at predicting what's important.
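To make the layering concrete, here's a rough Python sketch of the idea, not how any real agent framework does it. `LayeredMemory` and `summarize` are names I made up for illustration, and `summarize` is just a truncation stand-in for what would be a model call in practice. The small window plays the role of immediate attention; anything pushed out of it survives only as a compressed summary.

```python
from collections import deque


def summarize(text: str, max_words: int = 6) -> str:
    """Hypothetical stand-in for a model-driven summarizer.

    It only keeps the first few words; a real loop would ask a model
    to decide what is worth keeping, which is the hard part.
    """
    return " ".join(text.split()[:max_words])


class LayeredMemory:
    """Small working window plus a list of compressed long-term summaries."""

    def __init__(self, window_size: int = 3):
        self.window_size = window_size
        self.window = deque()   # immediate, detailed items
        self.long_term = []     # lossy summaries of evicted items

    def observe(self, item: str) -> None:
        """Add an item; compress whatever falls out of the window."""
        self.window.append(item)
        while len(self.window) > self.window_size:
            evicted = self.window.popleft()
            self.long_term.append(summarize(evicted))

    def context(self) -> str:
        """Build the text that would be handed to the model next."""
        summaries = ["[summary] " + s for s in self.long_term]
        return "\n".join(summaries + list(self.window))
```

The quality of the whole thing hinges on `summarize`: if it drops the wrong detail, the detail is gone for good, which is the coherence problem described above.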
Models don't need to remember more details; they need stronger processing of what they already remember.
> "Oh yea, just do it _better_. That's your problem."
I think we're talking about different things. What I mean is that models can be trained better, cramming more intelligence into the same number of parameters. It's similar to how your ability to learn (and perform) math depends on your prior math training.