I find that thinking/agent mode sometimes makes the result worse, or comes up with the same thing and just takes a long time. But I’m sure it’ll be different with fable for a few months until that hype blows over.
Something a lot of folks struggling with these systems don't get is that the instruction and management of them is often quite important - just because they're capable doesn't mean they're mind readers.
Most of the skepticism I encounter on this front comes from a lack of proper direction, no process for planning and review before execution, and too little attention to evaluation and feedback loops.
If you asked the smartest person in the world to YOLO a task with the sort of instruction the average denier uses to evaluate an LLM, you'd likely find they wouldn't get back what they were expecting either - and if you're evaluating on subpar models/tools, you shouldn't be surprised to get subpar results.