Did you compile with optimisations? I think GCC will do a bunch of activity on the stack at -O0, but with optimisations it'll generally coalesce everything into one stack adjustment on entry and one on exit per function (not because of any rule, but just because it's faster). alloca and other dynamic stack allocation may break this, but normal variables should in pretty much all cases just get turned into one block on the stack (with appropriate re-use of space if variable lifetimes don't overlap).
Yes
It will generate code to touch each page of the stack, because otherwise a very large stack allocation whose size is controlled by users (e.g. a variable-length array) can be turned by an attacker into a pointer to any location in memory. Faulting in each page of the stack turns that into a crash.
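
To make the "touch each page" idea concrete, here is a rough sketch in plain C of what such a probe loop does. It is only an illustration, not what any particular compiler emits: probe_pages and probe_demo are made-up names, the page size is passed in as a parameter, and it walks an ordinary heap buffer rather than the real stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: walk the region [base, base + size) from the top
 * down, the direction a stack grows on most targets, writing one byte per
 * page-sized step. Any whole page inside the region gets at least one
 * write, so a guard page in there would fault instead of being skipped.
 * Returns the number of writes performed. */
static size_t probe_pages(volatile unsigned char *base, size_t size,
                          size_t page_size)
{
    assert(page_size != 0);

    size_t touched = 0;
    size_t offset = size;

    while (offset > 0) {
        /* Never step further than one page at a time. */
        size_t step = offset < page_size ? offset : page_size;
        offset -= step;
        base[offset] = 0; /* the "probe" */
        touched++;
    }
    return touched;
}

/* Hypothetical demo: probe a heap buffer standing in for a large
 * stack allocation. Returns 0 if the buffer could not be allocated. */
static size_t probe_demo(size_t size, size_t page_size)
{
    unsigned char *buf = malloc(size ? size : 1);
    if (buf == NULL)
        return 0;

    size_t n = probe_pages(buf, size, page_size);
    free(buf);
    return n;
}
```

With a 4096-byte page, probe_demo(3 * 4096, 4096) does three writes and probe_demo(3 * 4096 + 1, 4096) does four. On a real stack the point is that one of those writes lands on the guard page before the stack pointer can be moved past it.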
There was a userspace thread library I came across a long time ago that used variable length arrays to switch between thread stacks; the scheduler would allocate an array of the right size to bump the stack pointer to the different thread's stack.