> whereas the teams are allowed to bring a 25-page PDF
This is where I see the biggest issue. LLMs are first and foremost text compression algorithms. They hold a compressed version of a very good chunk of human writing.
Beyond being text compression engines, LLMs are really good at interpolating text, thanks to the generalization induced by the lossy compression.
What this result really tells us is that, given a reasonably well compressed corpus of human knowledge, the ICPC can be viewed as an interpolation task.
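The compression framing isn't just a metaphor: a model that predicts text well can, in principle, compress it well (this is the classic prediction/compression equivalence). Here's a toy sketch under simplifying assumptions — a character bigram model standing in for an LLM, and ideal code length (negative log-probability in bits) standing in for compressed size. Text that resembles the training corpus costs fewer bits than the same characters in an unfamiliar order:

```python
# Toy illustration of the prediction = compression link.
# The corpus, model, and function names here are all hypothetical;
# a real LLM does the same thing over vastly richer statistics.
import math
from collections import Counter

def bigram_model(corpus):
    """Character bigram probabilities with add-one smoothing."""
    pairs = Counter(zip(corpus, corpus[1:]))
    unigrams = Counter(corpus[:-1])
    alphabet = set(corpus)
    def prob(prev, ch):
        return (pairs[(prev, ch)] + 1) / (unigrams[prev] + len(alphabet))
    return prob

def bits_needed(text, prob):
    """Ideal code length in bits: -sum of log2 p(next | prev)."""
    return -sum(math.log2(prob(a, b)) for a, b in zip(text, text[1:]))

corpus = "the cat sat on the mat the cat ate the rat " * 20
prob = bigram_model(corpus)

familiar = "the cat sat on the mat"
scrambled = "tac eht tam no tas eht"  # same characters, unfamiliar order
print(bits_needed(familiar, prob) < bits_needed(scrambled, prob))
```

The point of the sketch: better prediction means shorter codes, so "compressing a chunk of human writing" and "modeling it well enough to interpolate within it" are two views of the same capability.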
You can use the same framing for human reasoning, except it's over visual/auditory/spatial data and not just text.
You don't remember every detail of what you've seen, right? You store some lossy compression like "I went to a park".
If we develop a system that can:
- compress (in a relatively recoverable way) the entire domain of human knowledge
- interpolate across the entire domain of human knowledge
- draw connections or conclusions that haven't previously been stated explicitly
- verify or disprove those conclusions or connections
- update its internal model based on that (further expanding the domain it can interpolate within)
Then I think we're cooking with gasoline. I guess the question becomes whether those new conclusions or connections result in a convergent or divergent increase in the number of further conclusions and connections the model can draw (i.e. does updating the model with them merely deepen our understanding of the domains we already know, or does it expand the scope of what we understand into new domains).