> AI systems already exist that are superhuman on breadth of knowledge at undergrad understanding depth
Two problems with this:
1. AI systems hallucinate stuff. If one comes up with some statement, how will you know it didn't just hallucinate it?
2. Human researchers don't work just from their own knowledge; they can use a wide range of search engines. Do we have any examples of AI systems like these producing results that a third-year grad student couldn't produce with Google Scholar and similar tools, given the same instructions? Tests like the one in TFA should always be compared against that baseline.
> new science should be discoverable in fields where human knowledge breadth is the limiting factor
What are these fields? Can you give one example? And what do you mean by "new science"?
The way I see it, at best the AI could come up with a hypothesis that human researchers could then test. Again, you risk that the hypothesis is a hallucination and you waste a lot of time and money. And again, researchers can google shit and put together facts from fields other than their own. Why would the AI be able to find things the researchers can't?