I'm not sure of the exact relationship, but power consumption grows faster than linearly with clock speed. If you have 4 cores running at the same time, thermal throttling is more likely → lower clock speeds → lower energy consumption.
Greater power draw though; remember that energy is the integral of power over time.
By running more tasks in parallel across different cores, each core can run at a lower clock speed and potentially still finish before a single core at a higher clock speed could execute the tasks sequentially.
Ideally, running a program on 1 core or on N cores does not change the energy it consumes.
On N cores, the power is N times greater and the time is N times smaller, so the energy is constant.
In reality, the scaling is never perfect, so the energy increases slightly when a program is run on more cores.
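To put numbers on that, here is a minimal sketch of the energy = power × time argument. The function name, the `efficiency` parameter (parallel efficiency between 0 and 1) and the example figures are my own assumptions for illustration, not measurements, and all cores are assumed to run at the same clock.

```python
def energy_joules(power_per_core_w, single_core_time_s, cores, efficiency=1.0):
    """Energy for one run, with every core at the same clock speed.

    efficiency is a hypothetical parallel-efficiency factor:
    1.0 means perfect scaling, lower values mean the run takes longer.
    """
    total_power_w = cores * power_per_core_w                   # N times the power
    run_time_s = single_core_time_s / (cores * efficiency)     # ~N times shorter
    return total_power_w * run_time_s


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        ideal = energy_joules(10.0, 100.0, n)
        real = energy_joules(10.0, 100.0, n, efficiency=0.9 if n > 1 else 1.0)
        print(f"{n} cores: ideal {ideal:.0f} J, with 90% efficiency {real:.0f} J")
```

With perfect scaling the core count cancels out; with 90% efficiency the energy goes up by about 11%, whatever N is in this simple model.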
Nevertheless, as another poster has already written, if you have a deadline, you can greatly decrease the power consumption (and, since the run time is fixed by the deadline, the energy) by running on more cores.
To meet the deadline, you must either increase the clock frequency or increase the number of cores. The latter increases the consumed energy only very slightly, while the former increases the energy many times.
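A rough sketch of why the two options differ so much. It assumes power grows with the cube of the clock frequency, which is a common rule of thumb when voltage is scaled along with frequency and not an exact law, and it assumes 90% parallel efficiency; both numbers and the function names are placeholders of mine.

```python
def energy_ratio_from_clock(speedup, power_exponent=3.0):
    """Energy relative to baseline when the speedup comes from a higher clock.

    Assumption: power scales as frequency ** power_exponent.
    """
    power_ratio = speedup ** power_exponent
    time_ratio = 1.0 / speedup
    return power_ratio * time_ratio


def energy_ratio_from_cores(speedup, efficiency=0.9):
    """Energy relative to baseline when the speedup comes from more cores.

    Assumption: reaching `speedup` takes speedup / efficiency cores,
    each drawing the same power as the baseline core.
    """
    power_ratio = speedup / efficiency
    time_ratio = 1.0 / speedup
    return power_ratio * time_ratio


if __name__ == "__main__":
    print(f"2x via clock: {energy_ratio_from_clock(2.0):.2f}x energy")
    print(f"2x via cores: {energy_ratio_from_cores(2.0):.2f}x energy")
```

Under these assumptions, doubling the speed through the clock costs about 4 times the energy, while doubling it through extra cores costs about 1.1 times.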
So for maximum energy efficiency, first increase the number of cores up to the maximum while using the lowest clock frequency. Only when this is not enough to reach the desired performance should you increase the clock frequency, and then as little as possible.
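That rule can be sketched as a small selection routine. This is only an illustration under idealized assumptions: the work is measured in cycles, it splits perfectly across cores, and the names (`pick_operating_point`, `work_cycles`, `f_min_hz`, `f_max_hz`) are hypothetical, not taken from any real governor or scheduler.

```python
def pick_operating_point(work_cycles, deadline_s, max_cores, f_min_hz, f_max_hz):
    """Return (cores, frequency_hz) that meets the deadline, or None if impossible.

    Assumes perfectly parallel work: run time = work_cycles / (cores * frequency).
    """
    # Step 1: stay at the lowest clock and add cores until the deadline is met.
    for cores in range(1, max_cores + 1):
        if work_cycles / (cores * f_min_hz) <= deadline_s:
            return cores, f_min_hz

    # Step 2: all cores are in use, so raise the clock as little as possible.
    needed_hz = work_cycles / (max_cores * deadline_s)
    if needed_hz > f_max_hz:
        return None  # the deadline cannot be met even at full speed
    return max_cores, needed_hz


if __name__ == "__main__":
    # 8e9 cycles of work on a hypothetical 4-core chip clocked from 1 to 4 GHz.
    print(pick_operating_point(8e9, 4.0, 4, 1e9, 4e9))  # relaxed deadline
    print(pick_operating_point(8e9, 1.0, 4, 1e9, 4e9))  # tight deadline
    print(pick_operating_point(8e9, 0.1, 4, 1e9, 4e9))  # infeasible deadline
```

With the relaxed deadline two cores at the minimum clock are enough; with the tight one all four cores are used and the clock is raised only to 2 GHz instead of running one core flat out.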