Humans writing tests can only help against some subset of all problems that can happen with incompetent or misaligned LLMs. For example, they can game human-written and LLM-written tests just the same.