The stack pointer, esp on i386, denotes the top of the stack. All memory below the stack pointer (i.e. higher addresses) is occupied by parameters, variables and return addresses; memory above the stack pointer must be assumed to contain garbage.
When programming in assembly, it is equally easy to use memory below and above the stack pointer. Reading from or writing to addresses beyond the top of the stack is unusual and under normal circumstances, there is little reason to do so. There are, however, situations — rare situations — where it may tempting to temporarily use memory beyond the top of the stack.
That said, the question is whether it is really just a convention and good style not to grab beyond the stack of the stack or whether there are actually reasons why doing so could lead to problems.
When trying to answer this question, one first has to make a distinction between user mode and kernel mode. In user mode Windows, I am unable to come up with a single reason of why usage of memory beyond the top of the stack could lead to problems. So in this case, it is probably merely bad style.
However, things are different in kernel mode.
In one particular routine I recently wrote, I encountered a situation where temporarily violating the rule of not reaching beyond the top of the stack came in handy. The routine worked fine for quite a while. In certain situations, however, it suddenly started to fail due to memory corruption. Interestingly enough, the routine did not fail always, but still rather frequently.
Having identified the specific routine as being the cuplrit, I started single stepping the code. Everything was fine until I reached the point where the memory above the stack pointer was used. The window span only a single instruction. Yet, as soon as I had stepped over the two instructions, the system crashed. I tried it multiple times, and it was prefectly reproduceable when being single-stepped.
So I took a look at the stack contents after every single step I took. To my surprise, as soon as I reached the critical window, the contents of the memory location just beyond the current stack pointer suddenly became messed. Very weird.
After having been scratching my head for a while, that suddenly started to made sense: I was not the only one using the stack — in between the two instructions, an interrupt must have occured and been dispatched. As my thread happened to be the one currently running, it was my stack that has been used for dispatching it. This also explains why it did not happened always unless I was single-stepping the respective code.
When an interrupt occurs and no privilege-level change has to be performed, the CPU will push the EFLAGS, CS and EIP registers on the stack. That is, the stack of whatever kernel thread happens to be the one currently running on this CPU is reused and the memory locations beyond the stack pointer will be overwritten by these three values. So what I initially interpreted as garbage, actually were the contents of EFLAGS, CS and EIP.
On Windows NT, unlike some other operating systems (FreeBSD, IIRC), handling the interrupt, which involves runing the interrupt service routine (ISR) occurs on the same stack as well. The following stack trace, taken elsewhere, shows an ISR being executed on the stack of the interrupted thread:
f6bdab4c f99bf153 i8042prt!I8xQueueCurrentMouseInput+0x67
f6bdab78 80884289 i8042prt!I8042MouseInterruptService+0xa58
f6bdab78 f6dd501a nt!KiInterruptDispatch+0x49
f6bdac44 f6dd435f driver!Quux+0x11a
f6bdac58 f6dd61db driver!Foobar+0x6f
Morale of the story: Using memory beyond the current stack pointer is not only bad practice, it is actually illegal when done in kernel mode.