Yeah, this is about the worst way you could imagine to evaluate an AI model.
If you’d given it a real task you’d have been impressed.
I was floored by the day I spent with Fable. Got weeks of work done.
Same. It was one-shotting unbelievably well compared to 4.8.
Oh, I was also quite happy with Fable. I was just answering the question asked.