At most you could do multiple-choice questions, or questions with a single objective answer such as "what is the capital of France". But such tests aren't very helpful for learning new things.
Students cannot grade their own answers to more complex problems. Either they make mistakes but mark themselves correct because they don't understand the material, or they are correct but, because their answer doesn't say exactly the same thing as the example answer, they mark it wrong and change an answer that was already correct. And an LLM would be even worse than that at correcting tests.
This is not a bias against you; it's just a general thing that applies to all students and people. Nobody is good at correcting their own work; even the most esteemed professor gets his work checked by other people. And an LLM is not another person here: LLMs aren't good enough to check your work.
Note that it's much harder to accurately grade an answer than to answer a question.
You are so incorrect here. If you feed it an evaluation rubric, it will go after it! You really underestimate the technology when it is in good, well-intentioned hands. (Yes, lazy people will use it to cheat and get to the answer faster. But it is like a grad student tutor who will do an evaluation in 30 seconds! Rapid iteration and laddering up...)