Are you counting the time and effort it takes to evaluate the accuracy and relevance of what an LLM produces after it's left to "think" for a while?