OSCP Stack Buffer Overflow — Four ways to completely get rid of NOP (\x90\x90\x90) Slides from your Payloads

Aug 17, 2023

A few years ago (in 2016, to be precise) I was talking with a colleague who had failed his first OSCP exam attempt. The conversation quickly turned to his buffer overflow assignment: he complained that he could not get his payload to work, as it kept crashing. Another colleague in the discussion remarked that the NOP padding was probably missing. We all looked at him in surprise, with two obvious questions written on our faces: “What the hell is NOP padding, and why does the lack of it keep crashing my exploit?”

More than seven years later, one of the younger employees in my organization is preparing for his own first OSCP attempt and has raised a similar question. Although OffSec has in the meantime reduced the space devoted to buffer overflows in its PWK training (see [1]), the NOP slide (aka NOP sequence or NOP padding) problem is still there and can play a significant role during the OSCP exam. In this article I will not only explain this “phenomenon” in detail but also demonstrate how to generate payloads that do not need this “extra” addition at all.

— — — — — — - — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

Testing Environment

All the payloads generated in this article can be tested with a simple buffer overflow emulation app (bov.cpp), which can be found at [2] along with build and usage instructions for a Windows 10 environment.

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —- —-

Why do we even need NOP slides?

The problem has been partially explained in [3], but I would like to elaborate on it in a little more detail. What I want to emphasize before starting the analysis is that the need for NOP slides comes exclusively from the specifics of payloads generated with the Metasploit Framework (msfvenom), and in particular from the way MSF encoders work. This phenomenon has very little to do with the exploited vulnerabilities themselves, so we will focus here solely on Metasploit payloads encoded with shikata_ga_nai.

First, let’s take a look at the code in Figure 1, specifically at the two consecutive instructions fcmovnu and fnstenv that open our standard x86 reverse shell payload encoded with the shikata_ga_nai encoder. Their execution has two important outcomes:

The address of the fcmovnu instruction (referred to later in this article as the first FPU or first f* instruction) is written to the top of the stack, i.e. to the memory location ESP points to, where the decoder picks it up later in order to decode the whole payload.
The 28 bytes from [ESP-0x0C] to [ESP+0x0F] get overwritten by the fnstenv instruction (referred to later in this article as the second FPU or second f* instruction); see [4]. The overwritten bytes are marked in Figure 2.

Keep in mind that in a standard vanilla buffer overflow, such as the ones OSCP students exploit during their PWK training, the bytes from [ESP] to [ESP+0x0F] are the first 16 bytes of the payload. If we do not take any measures, the beginning of our payload will be ruined right at the start of its execution. One possible option is NOP padding, as suggested in the PWK course. How many NOPs should we use? If we overwrite the return address with the address of a jmp esp instruction, we need 16 bytes of NOPs. If we instead use the address of a call esp instruction, 12 bytes are enough, because the call pushes a 4-byte return address and thereby moves ESP 4 bytes below the start of the payload (in both cases 16 NOPs will do the job).

Figure 1. Disassembly of windows/shell_reverse_tcp payload encoded using shikata_ga_nai encoder.

Figure 2. Bytes overwritten by instruction fnstenv.
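The arithmetic behind the 16 and 12 byte figures can be checked with a short sketch. It is only a model under stated assumptions: offsets are measured relative to the first byte of the payload, and the function name and its esp_offset parameter are hypothetical, introduced here purely for illustration.

```python
FNSTENV_SIZE = 28     # bytes written by fnstenv in 32-bit protected mode
FNSTENV_DISP = -0x0C  # displacement in fnstenv [esp-0xc]


def clobbered_payload_bytes(esp_offset: int) -> int:
    """Count the payload bytes overwritten by fnstenv [esp-0xc].

    esp_offset (hypothetical parameter) is the value of ESP relative to the
    first payload byte when fnstenv executes: 0 after jmp esp, -4 after
    call esp, because the call pushes a 4-byte return address.
    """
    start = esp_offset + FNSTENV_DISP  # first offset written
    end = start + FNSTENV_SIZE         # one past the last offset written
    # Only offsets >= 0 belong to the payload itself.
    return max(0, end) - max(0, start)


if __name__ == "__main__":
    print("jmp esp :", clobbered_payload_bytes(0))
    print("call esp:", clobbered_payload_bytes(-4))
```

The number returned is also the minimum length of a NOP slide placed in front of the payload for the given scenario.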

However, this article is not about using NOPs in payloads but about eliminating them completely. Let’s see what possibilities are available.

Option I (and the best?) or just abusing “bad” characters

Let’s consider the fnstenv [esp-0xc] instruction explained above once more. Since it belongs to the part of the code responsible for decoding the actual Metasploit payload already present on the stack (see Figure 1), this instruction cannot be encoded itself. Its four-byte machine code is \xd9\x74\x24\xf4. So our solution is very simple: we declare one of these four bytes a bad character when generating the payload with msfvenom, and the output will not include the fnstenv instruction at all! As a result, our new payload will not require any NOP slide either. Let’s try this approach with msfvenom and \xd9 as the “bad” character:

Figure 3. Generation of shell_reverse_tcp payload with “bad” \xd9 character.

And let’s disassemble the beginning of the generated payload to confirm that fnstenv is absent (Figure 4).

Figure 4. Encoded shell_reverse_tcp payload without fnstenv instruction.
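The same check can be done without a disassembler by searching the raw bytes. The sketch below is a minimal illustration, not part of the original tooling: the helper names are hypothetical, and the sample byte string is a made-up fragment (fcmovnu, fnstenv, pop ebx), not a real msfvenom output.

```python
FNSTENV = bytes.fromhex("d97424f4")  # fnstenv [esp-0xc]


def blocks_fnstenv(bad_chars: bytes) -> bool:
    """True if at least one bad character is a byte of the fnstenv encoding."""
    return any(b in FNSTENV for b in bad_chars)


def contains_fnstenv(payload: bytes) -> bool:
    """True if the raw payload still carries the fnstenv byte sequence."""
    return FNSTENV in payload


if __name__ == "__main__":
    # Hypothetical bad character lists for illustration.
    print(blocks_fnstenv(b"\x00\xd9"))      # \xd9 is part of fnstenv
    print(blocks_fnstenv(b"\x00\x0a\x0d"))  # none of these bytes is

    # Made-up fragment: fcmovnu st,st(0); fnstenv [esp-0xc]; pop ebx
    sample = bytes.fromhex("dbd8d97424f45b")
    print(contains_fnstenv(sample))
```

Running blocks_fnstenv on the bad character list of a given target tells in advance whether the generated payload can contain fnstenv at all.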

I leave executing the newly generated payload to the reader as an exercise.

There is also a funny side to this solution. If the buffer overflow challenge given by OffSec during the OSCP exam required avoiding one of the \xd9 \x74 \x24 \xf4 characters, then of course the NOP slide (or any alternative to it) was not necessary at all -:) I wonder what percentage of students taking the exam realized that. Less than 1%? -:)

Option II or just pivoting the Stack Pointer (ESP)

(Ab)using bytes from the fnstenv [esp-0xc] encoding is a nice and elegant way to eliminate the NOP slide completely, but what can we do when those characters cannot be excluded from the generated payload? This can easily happen when the list of legitimate “bad” characters is already long and cannot be extended any further. Let’s take a quick look at Figure 5.

Figure 5. Visualization of pivoting stack pointer (ESP).

The left side of the figure shows the original position of our payload after overflowing the buffer. The return address gets overwritten with the address of a jmp/call esp instruction, and execution is transferred to our payload (blue), which the ESP register points to. Somewhere near the beginning of the payload, our fnstenv instruction is waiting to crash the execution if no preventive measures (e.g. a NOP slide) are taken. But wait a minute … What if we relocate the stack pointer (ESP)? Nothing in our payload will then be overwritten, and its execution will not be affected at all! How can we do this? In our case it is enough to subtract 0x10 from ESP before the payload (specifically, the fnstenv instruction) executes. We could make the amendment manually, but let’s first take a look at the shikata_ga_nai encoder source code in Figure 6 (on my Kali Linux 2022.3 it is in the file /usr/share/metasploit-framework/modules/encoders/x86/shikata_ga_nai.rb).

Figure 6. Source code of shikata_ga_nai encoder.

What we find here is the machine code of the fnstenv [esp-0xC] instruction! If we simply put a sub esp,0x10 instruction (\x83\xEC\x10) in front of it … Let’s replace \xd9\x74\x24\xf4 with \x83\xEC\x10\xd9\x74\x24\xf4 and generate an encoded shell_reverse_tcp payload (with “\x00” as the bad character):

Figure 7. Generation of encoded shell_reverse_tcp payload with pivoting stack pointer (ESP).

Figure 8. Instruction sub esp,0x10 included in the generated payload.

As we can see in Figure 8, after modifying the shikata_ga_nai encoder our sub esp,0x10 instruction is always placed directly before fnstenv.

The sub esp,0x10 instruction serves only as an example in this section. Any other instruction (or sequence of instructions) that lowers ESP far enough will work as well.
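As a rough sketch of why a pivot of 0x10 is sufficient, the snippet below builds the replacement byte string and computes which offsets fnstenv writes to after the pivot. The helper names are hypothetical, and offsets are assumed to be relative to the first payload byte, with ESP at offset 0 (the jmp esp case).

```python
FNSTENV = b"\xd9\x74\x24\xf4"  # fnstenv [esp-0xc]


def sub_esp_imm8(value: int) -> bytes:
    """Encode sub esp, imm8 (83 EC ib).

    The immediate is sign-extended, so it must stay within 1..0x7f
    to remain a positive subtraction.
    """
    if not 1 <= value <= 0x7F:
        raise ValueError("imm8 must be in the range 1..0x7f")
    return bytes([0x83, 0xEC, value])


def pivoted_fnstenv(pivot: int = 0x10) -> bytes:
    """Bytes that take the place of the plain fnstenv bytes in the encoder."""
    return sub_esp_imm8(pivot) + FNSTENV


def fnstenv_write_range(esp_offset: int, pivot: int) -> range:
    """Offsets written by fnstenv [esp-0xc] once ESP was lowered by pivot."""
    start = esp_offset - pivot - 0x0C
    return range(start, start + 28)


if __name__ == "__main__":
    print(pivoted_fnstenv().hex())
    written = fnstenv_write_range(esp_offset=0, pivot=0x10)
    # A highest written offset below 0 means the payload is untouched.
    print(written[0], written[-1])
```

With a pivot of 0x10 the highest offset written is -1, i.e. the byte just below the payload, so 0x10 is the smallest pivot that protects the payload in the jmp esp case.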

Option III or just (NOP) Slide again?

Another natural way to eliminate the NOP slide from payloads (building on Option II explained earlier) is to prefix the fnstenv machine code in the original shikata_ga_nai encoder source with … a slide of NOPs -:). Eliminating the NOP slide by adding NOPs again sounds a bit confusing, but since this is technically a valid option, I’m keeping its description here.

A 16-byte slide fits both scenarios: overwriting the return address with the address of a call esp instruction as well as with that of a jmp esp instruction (see the explanation in the section Why do we even need NOP slides?). Since the implementation of this method is very similar to the one described in the previous section, I leave its execution and testing to the reader.

Option IV or a Manual Intervention

Another option to consider is manual modification of the generated payload. Before we start changing payloads, however, let’s revert the earlier changes to the shikata_ga_nai encoder, generate the payload, and examine it again.

Figure 9. Disassembly of windows/shell_reverse_tcp payload encoded using shikata_ga_nai encoder.

Because I would like to keep this simple and short, without a deep analysis of the FPU instruction details, let’s limit ourselves to the following fact: the two FPU instructions (we will call them f* instructions in this section; fcmovbe and fnstenv, respectively, in Figure 9) ensure that the address of the first f* instruction (fcmovbe in the payload from Figure 9) is on the top of the stack after fnstenv [esp-0xC] executes.

Please keep in mind the following:

Payloads generated by msfvenom with encoders like shikata_ga_nai are randomized, which means that the payload does not always start with the first f* instruction and the second f* instruction does not necessarily follow the first one directly. In Figure 9 we can see the instruction mov eax,0x95849f80 between our f* instructions.
Both f* instructions are present “somewhere” (the exact location may vary) near the beginning of the payload, and the second f* instruction is always fnstenv [esp-0xC].
A number of FPU instructions can serve as the first f* instruction in encoded Metasploit payloads. In Figure 9 we can see fcmovbe st,st(3), but in Figure 8 it is ffree st(3).

Taking all of the above into account, our simple “manual intervention” requires the following steps:

Developing a piece of code that drops the address of our first f* instruction onto the stack. A call executed just before it could serve the purpose perfectly.
Replacing the fnstenv [esp-0xC] instruction with code of the same size that does not affect the rest of the payload in any way (four NOPs, “\x90\x90\x90\x90”, are a good example).

The second requirement can be satisfied by modifying the original shikata_ga_nai.rb file directly again. We need to replace “\xd9\x74\x24\xf4” from Figure 6 with “\x90\x90\x90\x90”.

The piece of code satisfying the first requirement can be found in Figure 10.

Figure 10. Pushing address of the next (00000009) instruction on stack.

A big advantage of this code is the lack of “zero” bytes in its content. Our testing application (bov.cpp) calls the strcpy function to overflow the buffer, hence zero bytes are not allowed in payloads at all. The last step of our manual intervention is to locate the first f* instruction in the newly generated payload and insert our code from Figure 10 (“\xeb\x02\xeb\x05\xe8\xf9\xff\xff\xff”) directly in front of it. An example of a fully crafted payload sending a reverse TCP shell to the address 127.0.0.1:443 can be found in Figure 11, and its full code in [2].

Figure 11. Example of a manually crafted (NOP-slide-free) payload sending a reverse TCP shell to 127.0.0.1:443 (amendments framed in red).
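To see why the nine bytes from Figure 10 leave the address of the following instruction on the stack, the control flow can be traced with a tiny sketch. It is not a real emulator: it understands only the two opcodes used by the stub (jmp short, EB, and call rel32, E8), and the function names as well as the sample fragment are hypothetical, made up for illustration.

```python
import struct

STUB = b"\xeb\x02\xeb\x05\xe8\xf9\xff\xff\xff"  # code from Figure 10


def trace_stub(code: bytes, max_steps: int = 16):
    """Follow jmp short / call rel32 until execution leaves the stub.

    Returns (final_offset, stack), where stack lists the pushed return
    addresses as offsets from the start of the stub.
    """
    eip, stack = 0, []
    for _ in range(max_steps):
        if not 0 <= eip < len(code):
            return eip, stack
        op = code[eip]
        if op == 0xEB:  # jmp short rel8
            rel = struct.unpack_from("<b", code, eip + 1)[0]
            eip += 2 + rel
        elif op == 0xE8:  # call rel32
            rel = struct.unpack_from("<i", code, eip + 1)[0]
            stack.append(eip + 5)  # return address = next instruction
            eip += 5 + rel
        else:
            raise ValueError(f"unsupported opcode {op:#04x} at offset {eip}")
    raise RuntimeError("step limit reached")


def insert_stub(payload: bytes, first_fpu_offset: int) -> bytes:
    """Place the stub directly in front of the first f* instruction."""
    return payload[:first_fpu_offset] + STUB + payload[first_fpu_offset:]


if __name__ == "__main__":
    print(trace_stub(STUB))
    # Made-up fragment: fcmovbe st,st(3); mov eax,0x95849f80; four NOPs
    fragment = bytes.fromhex("dad3" "b8809f8495" "90909090")
    print(insert_stub(fragment, 0).hex())
```

The trace jumps from offset 0 to the call at offset 4, the call pushes offset 9 and transfers control to offset 2, and the second jmp lands on offset 9. Execution therefore continues at the first f* instruction with its own address on top of the stack, which is the state fnstenv would otherwise have produced.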

Conclusions

The (vanilla) stack buffer overflow has been part of the OSCP (PWK) training practically since the course became available to students. Although its scope has been limited in the latest PWK version (see [1]), OffSec still teaches the technical details of this vulnerability based on the SyncBreeze v10.0.28 exploit (see record 42928 in Exploit-DB), which includes a NOP slide in its source code. As a result, the subject remains (so far) important to OSCP students and definitely deserves more attention. I hope my article helps students in situations where the original PWK training does not give it this attention.

Bibliography

[1] Offsec PWK Syllabus (https://www.offsec.com/wp-content/uploads/2023/03/pen-200-pwk-syllabus.pdf)

[2] Marcin Wolak — Simple Vanilla BoF Testing Application (https://github.com/marcin-wolak/bov/blob/master/bov.cpp)

[3] Whitney Travis — Why do you need NOPs? (https://traviswhitney.com/2019/02/27/why-do-you-need-nops/)

[4] fstenv:fnstenv x86 instruction (https://www.felixcloutier.com/x86/fstenv:fnstenv)