I run a media lab at a university. The ESP32 is great, but it has some downsides. None of them are dealbreakers, but in some cases they lead me to recommend a classic Arduino-type device instead:
1. Lack of 5V-tolerant pins. Beginners may not realize that a 5V signal can destroy the device, or that such signals need to be level-shifted.
2. Tooling may not work out of the box. As of today the setup boils down to pasting a URL into a field in the preferences, but that is something you need to know. You also need to select the right upload options, which are much more complex than with Arduino-type devices.
3. IMO the naming of the different dev boards is less clear, which also makes docs harder to find.
4. Examples may not work out of the box. Even simple Arduino examples can fail in ways that are hard for beginners to debug, because they can't tell whether it is a hardware issue, a wrong board/uploader setup, or a pinout issue (e.g. the onboard LED pin differs).
These are all issues students actually ran into when they used ESP32 boards without my guidance, so this is not just my opinion or a theory. And as I said, none of them are dealbreakers, but depending on the student's patience, stress level, perceived skill, etc., they might make me recommend an Arduino over an ESP32.