Having had some experience teaching, designing labs, and evaluating students, in my opinion there is basically no problem that can't be solved with more instructor work.
The problem is that the structure pushes for teaching productivity, which at this point in the optimization pretty much directly opposes good pedagogy.
Some specifics:
1. Multiple choice sucks. It's obvious that written response evaluates students better, and oral is better still. But multiple choice is graded instantly by a computer. Written response needs TAs. Oral is a huge time sink and needs a lot of TAs, plus lots of space if you want to run the exams in parallel.
1.5 Similarly, having students do things on computers is nice because you don't have to print anything, and even errors in a question can be fixed live: you just ask students to refresh the page. But if chatbots let them cheat too easily on computers, going back to handwritten assessments sucks because you have to arrange printing and scanning.
2. Designing labs is a clear LLM tradeoff. Autograded labs with testbenches and fill-in-the-middle style completions or API completions are incredibly easy to grade. You just pull the last commit before some specific deadline and run some scripts.
It's so easy you can do 200 students in the background while doing other work. But the problem is that LLMs are very good at fill-in-the-middle and at making testbenches pass.
I've actually tried some more open-ended labs before, and it's very impressive how creative students are. They are obviously not LLMs: there is a diversity of thought and a simplicity of code that you do not get with ChatGPT.
But it is ridiculously time-consuming to pull people's code and try to run the open-ended testbenches they have created.
3. Having students do class presentations is great for evaluating them. But you can only fit 6 or 7 presentations in a 1 hr block, so even in a relatively small class you will need to spend something like a week on them.
4. What I will say LLMs are fun for is having students do open-ended projects with faster iterations. You can scope creep the projects if you expect students to use AI coding.
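
For anyone curious what the "pull the commit before the deadline" step in point 2 can look like, here is a minimal Python sketch, not the actual scripts from any course. The function names (`last_commit_before`, `read_commits`), the `main` branch default, and the idea of one local clone per student are all assumptions made up for illustration. It reads commit times with `git log` and picks the newest commit at or before the cutoff.

```python
import subprocess
from datetime import datetime, timezone
from typing import Iterable, List, Optional, Tuple

Commit = Tuple[str, datetime]  # (sha, commit time); hypothetical alias


def last_commit_before(commits: Iterable[Commit], deadline: datetime) -> Optional[str]:
    """Return the sha of the newest commit at or before the deadline, or None."""
    eligible = [(when, sha) for sha, when in commits if when <= deadline]
    if not eligible:
        return None  # the student had nothing committed in time
    return max(eligible)[1]


def read_commits(repo_dir: str, branch: str = "main") -> List[Commit]:
    """Read (sha, committer time) pairs from a local clone using `git log`."""
    out = subprocess.run(
        ["git", "-C", repo_dir, "log", "--format=%H %ct", branch],
        check=True, capture_output=True, text=True,
    ).stdout
    commits = []
    for line in out.splitlines():
        sha, ts = line.split()
        # %ct is the committer date as a UNIX timestamp
        commits.append((sha, datetime.fromtimestamp(int(ts), tz=timezone.utc)))
    return commits
```

You would then check out the returned sha and run the testbench against it. One caveat: committer timestamps come from the student's own machine, so if you care about backdated commits, a server-side push time is the safer thing to cut on.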