You mean literally assign a grade, like B+? That's unlikely to work, given how token prediction and temperature work. What you get in the end is a probability distribution that reflects the model's runtime parameters, not the intelligence of the model.
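To make that concrete, here's a rough sketch of how temperature reshapes the distribution over candidate grade tokens. The logits are made up for illustration (they don't come from any real model), and `softmax_with_temperature` is just a hypothetical helper name, not an API from any library.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token logits after a prompt ending in "Grade:".
# These numbers are invented purely to show the effect.
grades = ["A", "B+", "B", "C"]
logits = [1.0, 2.0, 1.5, 0.2]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, {g: round(p, 3) for g, p in zip(grades, probs)})
```

Same logits, different temperature, noticeably different odds of sampling each grade. At low temperature the top grade dominates; at high temperature the neighbours get picked much more often.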
The gpt-5 reasoning models do not have a configurable temperature.
There's a reason why reasoning models are bad for creative writing. The thinking constrains the output.