What is the evidence for 1)? I thought that the latest models were getting "somewhere" with fairly trivial reasoning tests like ARC-1.
It may be that you can just find the solution for these tests by interpolating from a very large dataset.