Tasks like reversing a list (Karpathy) or counting the categories within it are far harder than simple prediction, the one thing LLMs are built to do.
Try it for yourself. Try it on a local model if you are paranoid that the cloud model is using a tool behind your back.
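
If you want to run the experiment with a known answer to check against, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the category names, the list length, the prompt wording, and the helper names `make_case`, `build_prompt`, and `score_reversal` are all hypothetical. It calls no model; you paste the prompt into whatever model you are testing and paste the reply back.

```python
import random
from collections import Counter

# Hypothetical category labels, chosen only for illustration.
CATEGORIES = ["apple", "banana", "cherry", "date"]

def make_case(n=30, seed=0):
    """Build a random list plus the ground-truth reversal and category counts."""
    rng = random.Random(seed)  # seeded so the same case can be reproduced
    items = [rng.choice(CATEGORIES) for _ in range(n)]
    return items, items[::-1], Counter(items)

def build_prompt(items):
    """Assumed prompt wording; adjust it to taste."""
    return (
        "Reverse this list and reply with only the reversed items, "
        "comma-separated:\n" + ", ".join(items)
    )

def score_reversal(expected, answer_text):
    """True only if the model's comma-separated reply matches exactly."""
    got = [part.strip() for part in answer_text.split(",") if part.strip()]
    return got == expected

if __name__ == "__main__":
    items, expected_reversed, expected_counts = make_case()
    print(build_prompt(items))
    print("Expected counts:", dict(expected_counts))
    # Paste the model's reply here to check it:
    # print(score_reversal(expected_reversed, "<model reply>"))
```

Longer lists and more categories make mistakes easier to spot, and the exact-match check is deliberately strict: one dropped or swapped item counts as a failure.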