Mctrain's Blog

What I learned in IT, as well as thought about life

Blind Return Oriented Programming (BROP) Attack (1)

| Comments

This topic consists of 3 sessions:


Blind Return Oriented Programming (BROP) Website

Hacking Blind: paper and slide


This paper is surely one of the best security papers I’ve ever read these years! Cool, fabulous, terrific, gorgeous, superb, sensuous…I would like to use any “good” word I can tell to describe it. And what’s more, the attack code is open sourced! I’ve read and talked about so many times about ROP attack, but actually I’ve never conducted or tried one in reality myself, but this time, I not only tried the “legendary” ROP attack, it is even the Blind one!

In this page, I will assume you have basic knowledge about Return Oriented Programming (ROP), so that I just explain what the BROP is and how they conduct it.

This BROP is really cool and smart, I hope I can throw light on it and make you clear.

Goal of BROP

  • Remotely ROP attack one program even when we don’t know both the source code and binary of this program. The remote server may be protected by NX, ASLR, PIE and stack canaries, and the system may be 64-bit rather than 32-bit.

When you first see this, you must find it is notoriously hard to accomplish such goal. However, there’s also 2 assumptions:

Attack Requirements

  • There’s at least one stack vulnerability, and the attacker has the knowledge of how to trigger it.
  • Server process will respawns after crash, and the respawn process will not re-rand (which means rerandomize its address space). This is reasonable, it holds for examples like nginx, MySQL, Apache, OpenSSH, Samba, etc.

Attack Procedure

Remotely dump memory

Since we don’t know the memory layout of the victim program, the first thing we need to do is to find an approach to dump the memory to our screen, thus we need to call a write system call with a socket file descriptor argument, that is:

write(int sock, void *buf, int len)

To tranform it to the assembly code, we can simply get following 4 pieces of gadgets:

write gadgets

So for this mission, what we need to do is finding these 4 gadgets, and constructing stack to provide appropriate values and invoke them sequentially.

Let’s now stop here, since currently we still have no idea about what the memory layout looks like, especially when the remote system are armed with ASLR and stack canaries.

Breaking Stack Canaries

How? One possible way is “Brute-forcing”, but that’s inefficient, and here the authors propose a so called “stack reading” approach.

Suppose this is the original layout of the compromised stack:

stack layout

We can try as many times as possible to find the overflow length (until crash due to canary tamper, here is 4096+8=4104), then we fill the 4096 bytes with arbitary value, and leak canary by sequentially overwriting a single byte with value x, if x is correct, than the server will not crash, we repeat x for all possible 256 byte values until it is found, till we find all 8 canary bytes, the process is shown as follows:

stack reading

We can also use this approach to read the saved frame pointer and return address.

Finding the stop gadget

Till now we already have the appropriate canary to bypass the stack canary protection, the next task is to find the mentioned gadgets.

Before exploring for these specific gadgets, there is one type of gadget we need to get: stop gadget.

As we know, when we overwright the return address to some guessed memory address, most of the time the program will crash, e.g., the return address point to code with a null pointer dereference, then the connection will close. However, there’s another situation, where the return address point to code which causes an infinite loop, so the program will just hang, and the connection will still stay open. This kind of gadget, the so called stop gadget, is fundamental to finding other gadgets.

Finding the potentially useful gadgets

Suppose we have a stop gadget that would cause the program to block, like an infinite loop or a blocking system call (e.g., sleep), then how can we find other potentially useful gadgets? (here for useful, I mean the gadgets that have some potential functions, other than the ones which will cause crash.)

Since now we can only manipulate the stack, the only things we can do are overwriting return addresses. Suppose if we guess an address of a useful gadget like pop rdi; ret, the application will still likely crash because it will eventually attempt to return to the next word on the stack, likely an invalid address, and this crash would cause us to discard the gadget classifying it as useless, as shown in the figure:

useful gadget but crash

However, if we have stop gadget as we mentioned above, it will be quite different. If we fill enough stop gadgets after the return address we want to probe, as follows:

stop gadgets usage

Then any crash gadget will cause program crash and be marked as useless, while any potential useful gadget will not cause crash. Nevertheless as you can see, there’s still a word called “potential”, which means there’s one exception, that is “stop gadget”. It is possible that the gadge we probe is also another stop gadget, however it never matters, since we will check later to assure the gadgets that we need.

Dumping memory remotely

As of now, everything seemed ready, then let’s see how to find and validate the 4 pieces of gadgets as we proposed at the beginning:

find write gadgets

As shown in the figure, to find the first 2 gadgets: pop %rsi; ret, pop %rdi; ret, we just need to find the so called BROP gadget, which is very common as it restores all callee saved registers. Misaligned parses of it yield a pop %rdi and pop %rsi. Unfortunately pop %rdx; ret gadgets are rare, so the optimization proposed by the authors is to find a call to strcmp instead which sets %rdx to the length of the string being compared, and together with finding call write, we can find them in the program’s Procedure Linking Table (PLT).

  • Finding BROP Gadget

Actually BROP gadgets are very special, since it sequentially pop 6 values from stack and ret. So if we use the approach with stop gadget mentioned above, we can fill 6 crash addresses before the stop gadget address:

find brop gadget

if any potential useful gadget fullfil such condition, it is very likely one of the BROP gadgets.

  • Finding PLT Entries

The PLT is a jump table at the beginning of the executable used for all external calls (e.g., libc). It has a very unique signature: each entry is 16 bytes apart (16 bytes aligned) and the slow path for each entry can be run at an offset of 6 bytes:

plt structure

What’s more, most of the PLT entries will not cause a crash regardless of arguments because they are system calls that return EFAULT on invalid parameters. One can therefore find the PLT with great confidence if a couple of addresses 16 bytes apart do not cause a crash, and can verify that the same addresses plus six do not cause a crash. And these addresses are also the first to have valid code as they are early on in the executable’s address space.

So when we find one of the PLT entries, what we need to do is check if it is strcmp or write one.

For strcmp, the authors validate it by exercising it with different arguments and seeing how the function performs. Thanks to BROP gadget, we can control the first two arguments passed to strcmp, and it will have following behavior and signature:

arg1 arg2 result
readable 0x0 crash
0x0 readable crash
0x0 0x0 crash
readable readable nocrash

Based on this signature, we have high possibility to find the strcmp PLT entry.

And for write, though there’s not any similar signature, we can still achieve it by scanning each PLT entry and forcing a write to the socket and checking whether the write occurred.

The only complication is figuring out the file descriptor number for the socket. There are two approaches: chaining multiple writes each with different file descriptor numbers in a single ROP chain, or opening multiple connections and using a relatively high file descriptor number in hope that it will match one of the connections.

At this point the attacker can write the entire .text segment from memory to the attacker’s socket, disassemble it, and find more gadgets. The attacker can also dump the symbol table and find useful functions in the PLT like dup2 and execve.

Doing exploit

So the most challenge part has been solved, the subsequent process will be (simply from paper):

  • Redirect the socket to standard input/output. The attacker can use dup2 or close, followed by either dup or fcntl(F_DUPFD). These are often in the PLT.
  • Find /bin/sh/ in memory. An effective technique is to find a writable memory region like the environment, environ, from the symbol table, and read /bin/sh from the attacker’s socket to that address.
  • execve the shell. If execve is not in the PLT, the attacker will need to transfer more of the binary to find a pop rax; ret and syscall gadget.

So in conclusion provided by the authors, the BROP attack is as follows:

  • Find where the executable is loaded. Either 0x400000 for non-PIE executables (default) or stack read a saved return address.
  • Find a stop gadget. This is typically a blocking system call (like sleep or read) in the PLT. The attacker finds the PLT in this step too.
  • Find the BROP gadget. The attacker can now control the first two arguments to calls.
  • Find strcmp in the PLT. The attacker can now control the first three arguments to calls.
  • Find write in the PLT. The attacker can now dump the entire binary to find more gadgets.
  • Build a shellcode and exploit the server.

By far, I’ve briefly introduced the principle of BROP attack, in the 2nd session, I would like to show how I conduct this attack in my setup environment.

Comments