> Do they know when the GenAI is bullshitting them?
Anecdote from a friend who teaches CS: this year a large number of students started adding unnecessary `break` statements to their C code, like so:
```c
while (condition) {
    do_stuff();
    if (!condition) {
        break;
    }
}
```
They asked around and realized that the common thread was ChatGPT - everyone who asked how loops work got a variation of "use break() to exit the loop", so they did. Given that this is not how you do it in CS (not only is it unnecessary, it also makes your formal proofs more complex), they had to make a general one-time exception and add disclaimers in exams reminding students to do it "the way you were taught in class".
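For contrast, a minimal sketch of the loop as taught, with a hypothetical countdown standing in for `condition` and `do_stuff()`:

```c
#include <stdio.h>

int main(void) {
    int n = 5;
    /* The while condition alone controls termination: no break is
     * needed, and the loop invariant stays simple to state and prove. */
    while (n > 0) {
        printf("working: %d\n", n);
        n--;
    }
    return 0;
}
```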
A colleague of mine once taught a formal methods course for students working on their master's -- not beginners by any stretch.
The exercise was to implement binary search, given the textbook specification, without any errors -- an algorithm they had probably implemented in their first-year algorithms course, at the very least. The students could write any tests they liked and add any assertions they thought would be useful. My colleague verified each submission against a formal specification. The majority of submissions contained errors.
For a simple algorithm that a student at that level could be reasonably expected to know well!
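For reference, a sketch of the algorithm in question; the course's exact specification isn't given here, so the signature and the invariant assertion are my own assumptions. Note the midpoint expression `lo + (hi - lo) / 2`, which sidesteps the classic overflow bug in `(lo + hi) / 2` -- one of the errors such submissions famously contain:

```c
#include <assert.h>

/* Binary search over a sorted int array: returns the index of target,
 * or -1 if it is absent. */
static int binary_search(const int *a, int n, int target) {
    int lo = 0;
    int hi = n - 1;
    while (lo <= hi) {
        /* lo + (hi - lo) / 2 cannot overflow the way lo + hi can */
        int mid = lo + (hi - lo) / 2;
        assert(lo <= mid && mid <= hi); /* sanity check on the invariant */
        if (a[mid] == target) {
            return mid;
        } else if (a[mid] < target) {
            lo = mid + 1;
        } else {
            hi = mid - 1;
        }
    }
    return -1;
}
```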
Now... ChatGPT and other LLM-based systems, as far as I understand, cannot do formal reasoning on their own. They cannot tell you, with certainty, that your code is correct with respect to a specification, and they can't tell you if your specification itself contains errors. So what are students learning by using these tools?
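To make the certainty point concrete: point tests like the ones below (reusing the hypothetical `binary_search` sketch above, with arbitrary inputs) can all pass against a buggy implementation, because they only check the inputs you happened to try; verification against a formal specification covers every input.

```c
#include <assert.h>

int main(void) {
    int a[] = {1, 3, 5, 7, 9};
    assert(binary_search(a, 5, 7) == 3);   /* present: found at index 3 */
    assert(binary_search(a, 5, 4) == -1);  /* absent: reported missing */
    /* Both asserts passing says nothing about the inputs not tried here. */
    return 0;
}
```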
If you take the generated code snippets and ask something like "There may or may not be something syntactically or stylistically wrong with the following code. Try to identify any errors or unusual structures that might come up in a technical code review.", it usually finds any problems, or at least surfaces differences of opinion on what the best approach is.
(This might work best if you have one LLM critique the code generated by another, e.g. bouncing back and forth between Claude and ChatGPT.)
Take a few points off the students who posted inane code by following the LLM, and those students will learn never to blindly follow an LLM again.
> use break() to exit the loop
Well - they know that `break` is not a function and you don't. Thanks, ChatGPT.