Red Zone and SIMD instructions in kernel
I'm following Writing an OS in Rust, and after reading the blog post A Minimal Rust Kernel I learned two things.
Red zone
The first is something called the Red Zone. The Red Zone is a region of stack memory that a function can use to store temporary variables. It is the 128 bytes immediately below the stack pointer (at lower addresses, since the stack grows downwards), and the function uses it without adjusting the stack pointer. This is an optimization that saves the instructions for adjusting the stack pointer. It is used by leaf functions (functions which don't call anything else).
The Red Zone becomes a problem when an exception or hardware interrupt occurs. The CPU and the exception handler push their data onto the stack starting at the stack pointer, so they write straight into the Red Zone (the stack pointer does not "own" this region) and overwrite data that the leaf function is still using. When control returns to the function, it reads corrupted data.
You may ask why this optimization is used in the first place if it causes such issues. The thing is that it's not an issue for user-space programs, because the interrupt is handled on a completely different stack, the kernel's. That's not the case for code running in the kernel itself, which is why you need to turn the Red Zone off when implementing your own kernel.
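To make the corruption concrete, here is a minimal sketch that simulates it with a toy stack. Everything in it (`ToyStack`, `leaf_store`, `interrupt`, the 512-byte array) is made up for illustration, and it is heavily simplified: a real x86_64 CPU also aligns the stack pointer before pushing the interrupt frame. The 40-byte frame stands in for the five 8-byte values the CPU pushes on an interrupt in 64-bit mode.

```rust
/// Size of the Red Zone in bytes.
const RED_ZONE_SIZE: usize = 128;

/// A toy stack (hypothetical): a byte array plus an index acting as the
/// stack pointer. It grows downwards, so pushing decrements `rsp`.
struct ToyStack {
    memory: [u8; 512],
    rsp: usize,
}

impl ToyStack {
    fn new() -> Self {
        ToyStack { memory: [0; 512], rsp: 256 }
    }

    /// A leaf function stores a temporary in the Red Zone:
    /// below `rsp`, without moving `rsp`.
    fn leaf_store(&mut self, offset_below_rsp: usize, value: u8) {
        assert!((1..=RED_ZONE_SIZE).contains(&offset_below_rsp));
        self.memory[self.rsp - offset_below_rsp] = value;
    }

    /// Read the temporary back from the Red Zone.
    fn leaf_load(&self, offset_below_rsp: usize) -> u8 {
        self.memory[self.rsp - offset_below_rsp]
    }

    /// An interrupt handled on the SAME stack pushes its frame
    /// starting at `rsp`, then restores `rsp` when it returns.
    fn interrupt(&mut self, frame: &[u8]) {
        let saved_rsp = self.rsp;
        for &byte in frame {
            self.rsp -= 1;
            self.memory[self.rsp] = byte;
        }
        self.rsp = saved_rsp;
    }
}

/// Returns the leaf function's temporary before and after an interrupt.
fn red_zone_demo() -> (u8, u8) {
    let mut stack = ToyStack::new();
    stack.leaf_store(8, 42); // temporary lives 8 bytes below rsp
    let before = stack.leaf_load(8);
    stack.interrupt(&[0xAA; 40]); // interrupt frame lands in the Red Zone
    let after = stack.leaf_load(8);
    (before, after)
}

fn main() {
    let (before, after) = red_zone_demo();
    println!("before interrupt: {}, after interrupt: {}", before, after);
}
```

The leaf function wrote 42, but after the simulated interrupt it reads back 170 (`0xAA`), because the frame was pushed over the bytes it was still using.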
More about that can be read here.
SIMD
The next thing is SIMD instructions in the kernel and their negative impact on performance. That's pretty interesting, as SIMD (Single Instruction Multiple Data) is a technique used by modern CPUs which is supposed to make CPU-bound computations faster, not slower. To understand the root cause of the performance impact, we first need to say more about SIMD.
SIMD is a technique which allows the same instruction to be executed on multiple pieces of data at the same time. There are various SIMD standards, which differ in the number of bits per register and the number of registers:
- MMX (Multi Media Extension): 64 bits per register, 8 registers available.
- SSE (Streaming SIMD Extensions): 128 bits per register, 16 registers in 64-bit mode.
- AVX (Advanced Vector Extensions): 256 bits per register, 16 registers in 64-bit mode.
Now, why is this an issue in the kernel? Every time a hardware interrupt occurs, the kernel needs to save the registers it might overwrite so that it can restore them later. As you can see above, the SIMD state is large, so saving and restoring all of it on every interrupt simply takes too much time.
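To get a feeling for the sizes involved, the sketch below just multiplies out the numbers from the list above. The `RegisterFile` type and the constants are hypothetical names for this illustration, and the general-purpose register baseline (16 registers of 64 bits on x86_64) is added here only for comparison. Note that the sets overlap in hardware (the lower halves of the AVX registers are the SSE registers), so the totals should not be summed.

```rust
/// One register set: register width and register count.
/// (Hypothetical type, used only for this back-of-the-envelope calculation.)
struct RegisterFile {
    name: &'static str,
    bits_per_register: usize,
    register_count: usize,
}

impl RegisterFile {
    /// Bytes needed to save every register in this set.
    fn total_bytes(&self) -> usize {
        self.bits_per_register / 8 * self.register_count
    }
}

/// The three SIMD register sets from the list above.
const REGISTER_FILES: [RegisterFile; 3] = [
    RegisterFile { name: "MMX", bits_per_register: 64, register_count: 8 },
    RegisterFile { name: "SSE", bits_per_register: 128, register_count: 16 },
    RegisterFile { name: "AVX", bits_per_register: 256, register_count: 16 },
];

/// Baseline for comparison: the x86_64 general-purpose registers.
const GENERAL_PURPOSE: RegisterFile = RegisterFile {
    name: "general purpose",
    bits_per_register: 64,
    register_count: 16,
};

fn main() {
    println!("{}: {} bytes", GENERAL_PURPOSE.name, GENERAL_PURPOSE.total_bytes());
    for file in REGISTER_FILES.iter() {
        println!("{}: {} bytes", file.name, file.total_bytes());
    }
}
```

This gives 64 bytes for MMX, 256 bytes for SSE, and 512 bytes for AVX, compared with 128 bytes for the general-purpose registers. Saving the AVX state alone means moving four times as much data as the general-purpose registers on every interrupt.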
That's why we need to disable SIMD support when developing a kernel. The problem is that on x86_64, floating-point operations use the SSE registers by default. Fortunately, we can instruct LLVM to use software functions that emulate floating-point operations with ordinary integer instructions (the `soft-float` target feature).
More on that can be found here.