
3.22.2012

The Stack! (Leave your hammer at home) pt. 1

Today's topic? The stack. Sure, it seems innocent and simple enough, but is it? If you want to go with the basic premise of "you put arguments on the stack" and leave it at that, I guess you can call it simple, but that's not the whole story. The stack has a history and can get quite complex at times.

Well, to start off with the obvious, the stack is where function arguments go under the __stdcall calling convention (this is true on x86; the amd64 calling conventions pass the first few arguments in registers and only use the stack for any additional ones, but I digress).

Therefore:
MessageBoxA(NULL, "Text!", "Caption!", MB_OK);

Turns into:
push MB_OK
push OFFSET Caption
push OFFSET Text
push 0
call <&user32.MessageBoxA>


So as you can see, its primary purpose is the conveyance of arguments, but what else is its purpose?
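To make the push order a bit more concrete, here is a minimal sketch of a toy stack in C. Everything in it (ToyStack, toy_push, toy_call4, the addresses) is made up for illustration and is not how any real compiler models things. It only shows that pushing right to left leaves the first argument closest to the top of the stack, right under the return address.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_DWORDS 16

/* Toy model of a 32-bit stack. esp is a byte offset into mem, starting
   at the highest address because the stack grows toward lower ones. */
typedef struct {
    uint32_t mem[STACK_DWORDS];
    uint32_t esp;
} ToyStack;

void toy_init(ToyStack *s)
{
    s->esp = STACK_DWORDS * 4;   /* empty stack */
}

/* push: subtract 4 from esp, then store the value at [esp] */
void toy_push(ToyStack *s, uint32_t value)
{
    assert(s->esp >= 4);         /* no room left otherwise */
    s->esp -= 4;
    s->mem[s->esp / 4] = value;
}

/* Read the dword at [esp + offset] */
uint32_t toy_peek(const ToyStack *s, uint32_t offset)
{
    assert((s->esp + offset) / 4 < STACK_DWORDS);
    return s->mem[(s->esp + offset) / 4];
}

/* Push four arguments right to left, then the return address, which is
   what the call instruction does on its own. */
void toy_call4(ToyStack *s, uint32_t ret_addr,
               uint32_t a1, uint32_t a2, uint32_t a3, uint32_t a4)
{
    toy_push(s, a4);
    toy_push(s, a3);
    toy_push(s, a2);
    toy_push(s, a1);
    toy_push(s, ret_addr);
}
```

After toy_call4, [esp] holds the return address, [esp+4] the first argument, [esp+8] the second, and so on, which is the view the called function gets on entry.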

Well, it also holds the stack frame for any local variables within a function. So let's say we have this slice of code.

#include <stdio.h>
#include <stdlib.h>

int SimpleFunction(int);

int main(int argc, char* argv[])
{
    printf("5+5: %d\n10+5: %d", SimpleFunction(5), SimpleFunction(10));
    exit(0);
}

//Simply add 5 to the number and return the number
int SimpleFunction(int n)
{
    //Store n + 5 into a local variable
    int val = n + 5;
    return val;
}


This simply prints a number plus 5 for both 5+5 and 10+5, simple enough, right? Well, our variable int val inside of SimpleFunction is stored on the stack, not on the heap or in a data section, which makes this chunk of fun short-lived. You can't count on it living outside of this call: it exists only while SimpleFunction is running, and it can be seen on the stack during said call.

Now let's compile this source, and view it in OllyDbg to get its full effect!

So here is SimpleFunction disassembled, which isn't a whole lot of code, but that's alright!
We are sitting on execution before even pushing ebp, so you'll get the full effect!
Here is our stack frame at our current point of execution.

So seeing as how the stack frame still needs to be set up for this call, esp is currently pointing to our return address.

Our first instruction is push ebp, which merely stores the prior stack frame pointer onto the stack for later retrieval, as ebp is used to refer to any arguments and local variables once inside the local frame. mov ebp,esp copies the current stack pointer into ebp, and sub esp,0x10 subtracts 16 bytes from esp to reserve room for local variables (and scratch space, should the function make calls of its own). Why save ebp at all? Let's say a() calls b(): we still need access to a()'s local variables and the like once b() completes and we are back in a()! That is the idea behind pushing the old ebp value and restoring it on exit, either via mov esp,ebp (or an add esp,XX) followed by pop ebp, or, as you will see in the image, a leave instruction, which simply readjusts the stack and restores ebp for you.

Now less ranting and more images!
So this is the image of the stack after the instructions push ebp; mov ebp,esp; sub esp,0x10;
As you can see, we now have space on our stack for local variables! The register ebp is pointing to 0x0060fe08, which means that [ebp+4] is none other than our return address, and [ebp+8] is our argument! As a general rule, [ebp] is the old ebp value for the prior stack frame, [ebp+4] is the return address, and anything at [ebp+XX] where XX is 8 or higher (in increments of 4) is an argument being accessed. This is assuming a 32-bit system, of course :).
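If you'd rather poke at those offsets without a debugger, here is a rough simulation in C. It is only a sketch: the Cpu struct, the tiny memory array, and the addresses are all invented for the example, and real frames can carry padding or saved registers that this ignores.

```c
#include <assert.h>
#include <stdint.h>

enum { MEM_DWORDS = 32 };

typedef struct {
    uint32_t mem[MEM_DWORDS];  /* mem[a / 4] models the dword at address a */
    uint32_t esp;
    uint32_t ebp;
} Cpu;

uint32_t load(const Cpu *c, uint32_t addr)
{
    assert(addr % 4 == 0 && addr / 4 < MEM_DWORDS);
    return c->mem[addr / 4];
}

void store(Cpu *c, uint32_t addr, uint32_t value)
{
    assert(addr % 4 == 0 && addr / 4 < MEM_DWORDS);
    c->mem[addr / 4] = value;
}

void push(Cpu *c, uint32_t value)
{
    assert(c->esp >= 4);
    c->esp -= 4;
    store(c, c->esp, value);
}

uint32_t pop(Cpu *c)
{
    uint32_t value = load(c, c->esp);
    c->esp += 4;
    return value;
}

/* push ebp; mov ebp,esp; sub esp,0x10 */
void prologue(Cpu *c)
{
    push(c, c->ebp);
    c->ebp = c->esp;
    assert(c->esp >= 0x10);
    c->esp -= 0x10;
}

/* leave is shorthand for: mov esp,ebp; pop ebp */
void leave(Cpu *c)
{
    c->esp = c->ebp;
    c->ebp = pop(c);
}
```

Push an argument and a return address, run prologue, and load(&c, c.ebp + 4) gives back the return address while load(&c, c.ebp + 8) gives the argument. After leave, ebp is the caller's again and esp is back on the return address, ready for ret.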

Now let's take a look at what happens on the flip side of things! When things like [ebp-4] are accessed! In the code above, there are two instructions that access [ebp-4], and those are at 0x004013f8 and 0x004013fb. What [ebp-4] is actually accessing is the local variable val!

So to put a long story short, a stack frame is created anytime a call is entered, and the prior stack frame is preserved upon entering another function call!
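Because every frame stores the previous ebp at [ebp], the frames form a chain you can follow back toward main, which is roughly what a debugger's call stack window does. Below is a minimal sketch of that walk. The Frame struct is a simplified stand-in (a real 32-bit frame holds raw dwords, not C pointers), and the return addresses you feed it are whatever you like.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The two values every ebp-based frame starts with. */
typedef struct Frame {
    const struct Frame *saved_ebp;  /* [ebp]  : caller's frame, NULL at the outermost */
    uint32_t ret_addr;              /* [ebp+4]: where to resume in the caller */
} Frame;

/* Follow the saved-ebp chain from the current frame and collect the
   return addresses, innermost first. Returns how many were written. */
size_t walk_frames(const Frame *ebp, uint32_t *out, size_t max)
{
    size_t n = 0;
    assert(out != NULL || max == 0);
    while (ebp != NULL && n < max) {
        out[n++] = ebp->ret_addr;
        ebp = ebp->saved_ebp;
    }
    return n;
}
```

With three frames linked c -> b -> a, walking from c yields c's return address first, then b's, then a's.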

If we have functions a(), b(), and c() and they are called in their respective order, the stack frame will end up looking like this! (also please excuse my poor drawing skills if you'd be so kind, thank you)

[[edit: YIKES! I forgot that the current address of ebp in a stack frame actually points to its old value, not the RET ADDR! Sorry folks! Just pretend like I got everything right the first time :). Funny considering I got it right in explaining [ebp+4] up above...Oh well!]]


Remember that even though the stack is drawn growing up, it actually grows down. Yes, I said down! The addresses actually decrease as the stack grows: every push subtracts from esp. (On x86 this direction is baked into push, pop, call, and ret. The "expand-down" segment type you can set in a GDT entry only changes how the segment limit is checked, but all that at a later time!)

The stack frame bit on the very top of my horribly done drawing is a top-down look at the stack frame, in the same view as the stack (grows up, but addresses decrease). The frame is set up either manually (push ebp; mov ebp,esp; sub esp,XX) or via an enter instruction (which handles the stack manipulation for you in fewer instruction bytes, but is noticeably slower than the manual sequence on most processors). Upon execution of the call instruction, however, the RET ADDR is auto-magically pushed onto the stack, and as for the arguments...well, those were placed there by either you or your compiler :).

Well that concludes this part of The Stack! (Leave your hammer at home)! Just getting a basic feel for the stack is super important for x86 (and I can also say a few other assembler languages too :P), and especially exploit development. The stack is a tricky mistress at first, but follows laws of logic much like Newtonian laws of physics.

So long, and until next time folks!

3.21.2012

Coming Soon!

So! Here I am again! I am currently working on an intro-to-the-stack type article for this, but I plan on doing more than just intro articles! I am quite interested in more advanced aspects, such as SafeSEH, and perhaps looking at some real world exploits in depth!
As far as projects go, I am considering working on an ncurses debugger (thus making it portable, or at least that's my hope!) for OSX, win32, win64, and Linux. Seems hopeful, but I think a common debugger for various operating systems would be a nice touch! (A free one at least)
The debugger is mostly a thought, not something I have committed to yet, but I would enjoy delving deep into operating system internals!
I also plan on doing an article breaking down the various aspects of OllyDbg and explaining shortcuts! Shortcuts are named aptly so, and are a nice touch in improving efficiency!
So that's a quick summary of what I have in store in the near future! I would cover IDA, but unfortunately I do not own the non-free version (5.0 I think).
Anyway, until next time!

3.20.2012

The Beer Battered B0f (and friends)

Hello and welcome, to the world of tomorrow! Okay, so buffer overflows aren't exactly the most NEW thing hanging around; the little devils have been known since at least the 70s and were made famous in the 80s and 90s by a bunch of Unix hackers, but you know, that's okay. We are here today and now, to rediscover the magic (for win32)!

Things you probably need!

  • A computer!
  • mingw32 cross compiler! (sudo apt-get install mingw32)
  • A debugger! (I use OllyDbg as my all time favorite 3 time Emmy winner)
  • If you are doing this under Linux like myself, get yourself Wine! (sudo apt-get install wine)


Here, my friends, is your first C file! Simple, yet effective, as we are not going to hop right into the stack based overflows, but rather, the heap based overflows, with a few toppings! (I am using "heap based" loosely here: strictly speaking, what we overflow below is a global variable, not memory handed out by malloc.) Heap based overflows are a tad different than your traditional stack based overflows, in the sense that instead of attempting to overwrite a local variable placed on that ever so loving stack, we have a global variable in the program's data area (the '.data' section, or typically '.bss' for an uninitialized global like ours)! Now lo and behold, there's also a chance a 'clever' programmer may have placed a lovely dword value JUST after this global-variable-turned-string, and if so we may have ourselves some fun IF this dword is a resolvable address that gets called later! Let's see if we can whirl this out, in a lovely source!

(Viewer's discretion is advised, the following source file was conceived on a Linux operating system with mingw32. Not for the faint of heart.)



#define _WIN32_WINNT    0x501
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>


//16 byte buffer, with the function pointer declared right after it
char my_message[16];
int (WINAPI *my_api_call)(HWND, char*, char*, unsigned int);


int main(int argc, char* argv[])
{
        //Load user32.dll into address space, and
        //then use win32api to return the function
        //address for MessageBox
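        HMODULE lib = LoadLibrary("User32.dll");
        //GetProcAddress returns a generic FARPROC, so cast it
        my_api_call = (int (WINAPI *)(HWND, char*, char*, unsigned int))
                GetProcAddress(lib, "MessageBoxA");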
        printf("Hello and welcome! Please try to insert a 16\n");
        printf("character only string! Please don't hack me!\n");
        //Get the string from user input, with NO CARE about its
        //length, demeanor, or intent!
        gets(my_message);
        //Now we just call MessageBox with our string from user
        //blindly skipping through a meadow.
        my_api_call(NULL, my_message, "UserInput!", MB_OK);
        //Free our library from the address space for 
        //good measure, then exit
        FreeLibrary(lib);
        exit(0);
}



So there we have it folks! A seemingly innocent software bug no one would EVER think about hijacking and using to dial back to a server to upload malicious software onto the host without the user's consent! Why no one would ever even dream of that!

So let's get a rundown of what exactly all this ruckus is about! Well we have a program here that simply asks for user input, then displays it in the form of a message box, like so!



Well look at that! Seems simple enough! Just throw whatever string you want in there, and it shows up on the MessageBox! But let's say we become a wee bit 'evil' and go against what this lovely program is telling us NOT to do. We are going to bypass the 16 character limit with 20 characters! Oh dear!


Well this looks a bit bad if you notice that our access violation occurs at 0x21657265! Well that seems to be an odd address in the first place! Let's see if a bit of python can't clear up this horrible mess!



$python -c 'print("\x21\x65\x72\x65")'
!ere



Well! It seems the address it was attempting to call was none other than the string "!ere". This means that it took the last four characters of our input and tried to use them as the address to call! (Remember though that even though it appears backwards, and you are probably saying "Jeez David, are you crazy? It's backwards!", this is little-endian my man! The bytes sit in memory as "ere!", and the processor reads the first one as the least significant byte of the address.)
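If the little-endian flip still feels like sleight of hand, here is a small C sketch of the same conversion the python one-liner did by hand. The function names are my own invention for this example. It turns a 32-bit value into the bytes as they sit in memory on x86, and back again.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write the four bytes of value in x86 memory order (least significant
   byte first), followed by a terminating NUL. out must hold 5 chars. */
void dword_to_memory_bytes(uint32_t value, char out[5])
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++)
        out[i] = (char)((value >> (8 * i)) & 0xFFu);
    out[4] = '\0';
}

/* The reverse: treat four bytes in memory as one little-endian dword. */
uint32_t memory_bytes_to_dword(const char in[4])
{
    uint32_t value = 0;
    assert(in != NULL);
    for (int i = 0; i < 4; i++)
        value |= (uint32_t)(unsigned char)in[i] << (8 * i);
    return value;
}
```

Feeding 0x21657265 to dword_to_memory_bytes gives the characters 'e', 'r', 'e', '!' in that order, and memory_bytes_to_dword("ere!") gives back 0x21657265, the address from the access violation.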


So what does this all mean? All this hokum and nonsense? Well this means that one could easily derail your application and use it for evil and malicious purposes! That's no good if you're trying to make a living! So let's dissect this for a bit real fast...



  1. The calling program, looking at the source, uses win32api functions 'LoadLibrary' and 'GetProcAddress' to get the address from the export table, of the function 'MessageBoxA'.
  2. This is then placed in a variable that follows immediately after a 16 character buffer, in all its glory!
  3. Prints text to the screen in hopes the user will be gentle with it.
  4. Grabs input from the user without sanitizing/limit checking it at all.
  5. Shoves that string in as an argument in a 'MessageBoxA' call, and then just jumps willy-nilly into the fray with a 'call near eax' (with eax containing the value of my_api_call).
  6. 'FreeLibrary' of loaded user32.dll.
  7. 'exit(0)' to close it all up!
So that is the basic flowchart of this program's lifetime. Simple, but with one inherent and huge problem: the buffer overflow. Since we were able to crash this program without so much as a care in the world, this could lead to severe problems down the road! For example, what if I were to simply just craft the user input to do this...

Hex: (which I merely edited in with OllyDbg, no 'cat' to stdin this time!)
68 28 40 40 00 E8 B2 D5 FF FF EB FE 90 90 90 90 14 40 40 00 49 20 65 6E 6A 6F 79 20 68 61 63 6B 69 6E 67 21


Hmmm? Well let's see the outcome shall we?

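Well, well, well! Would you look at that! I do enjoy hacking! So all that lovely hexadecimal did was call 'puts' with the string I made in there, 'I enjoy hacking!'. Reading the bytes in order: 68 28 40 40 00 is push 0x00404028 (the address of the string at the tail end of the input), E8 B2 D5 FF FF is a relative call to the 'puts' stub, EB FE is a jmp to itself so execution simply parks there, and the four 90s are NOPs padding the code out to 16 bytes. The next dword, 14 40 40 00, overwrites my_api_call with 0x00404014, which (judging by the layout) is the start of our own buffer, so the 'call near eax' lands right on our code. Since 'puts' was already imported by the linker and included as an IAT stub to MSVCRT, all we simply had to do was call the <jmp &msvcrt.puts> and we are in the clear!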
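The E8 B2 D5 FF FF in the payload above is a relative call, so where it goes depends on where it sits. Here is a small sketch of that arithmetic in C. One loud assumption: I am taking the buffer to start at 0x00404014 (the value written over the function pointer), which would put the call at 0x00404019. The helper name is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Decode the destination of a 5-byte `call rel32` (opcode E8) that is
   located at insn_addr. The displacement is relative to the address of
   the NEXT instruction, and the sum wraps around modulo 2^32. */
uint32_t call_rel32_target(uint32_t insn_addr, const uint8_t insn[5])
{
    assert(insn[0] == 0xE8);
    uint32_t rel = (uint32_t)insn[1]
                 | (uint32_t)insn[2] << 8
                 | (uint32_t)insn[3] << 16
                 | (uint32_t)insn[4] << 24;   /* little-endian displacement */
    return insn_addr + 5 + rel;
}
```

Under that assumption the call resolves to 0x00404019 + 5 + 0xFFFFD5B2 = 0x004015D0, which would be the address of the puts stub in this particular build. A different build or load address would need a different displacement.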

So, in conclusion, we learned about the lovely heap based buffer overflow, and all of its simple, yet effective mechanics. This just goes to show that information security plays a vital role in today's market. It's hard to stay one step ahead of today's hackers, and they are skilled individuals, but it takes a hacker to beat a hacker!
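For completeness, here is one way the input step could have been written so the 16 byte buffer stays 16 bytes. This is just a sketch using fgets from the standard library; the helper name read_line_bounded is my own, and it is one of several reasonable fixes rather than the fix.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read one line into buf, never writing more than size bytes (including
   the terminating NUL). Anything that does not fit is thrown away.
   Returns the number of characters stored, or -1 on EOF or error. */
int read_line_bounded(FILE *in, char *buf, size_t size)
{
    assert(in != NULL && buf != NULL && size > 0);
    if (fgets(buf, (int)size, in) == NULL)
        return -1;

    size_t len = strcspn(buf, "\n");
    if (buf[len] == '\n') {
        buf[len] = '\0';             /* whole line fit: drop the newline */
    } else {
        int ch;                      /* line was too long: discard the rest */
        while ((ch = getc(in)) != EOF && ch != '\n')
            ;
    }
    return (int)len;
}
```

Swapping gets(my_message) for read_line_bounded(stdin, my_message, sizeof my_message) would cap the input at 15 characters plus the NUL, so the function pointer sitting after the buffer never gets touched.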

Until next time folks!