The ceiling will soon be super-human.
What do you base this on? For me, it is almost impossible to guess what fits into the context of an LLM. Sometimes trivial tasks fail; sometimes quite complex things get one-shotted.