This topic consists of 3 sessions:
- BROP Principle - dump memory to attacker and do exploit
- BROP Practice1 - attack conduct
- BROP Practice2 - code analysis
This paper is surely one of the best security papers I’ve ever read these years! Cool, fabulous, terrific, gorgeous, superb, sensuous…I would like to use any “good” word I can tell to describe it. And what’s more, the attack code is open sourced! I’ve read and talked about so many times about ROP attack, but actually I’ve never conducted or tried one in reality myself, but this time, I not only tried the “legendary” ROP attack, it is even the Blind one!
In this page, I will assume you have basic knowledge about Return Oriented Programming (ROP), so that I just explain what the BROP is and how they conduct it.
This BROP is really cool and smart, I hope I can throw light on it and make you clear.
Goal of BROP
- Remotely ROP attack one program even when we don’t know both the source code and binary of this program. The remote server may be protected by NX, ASLR, PIE and stack canaries, and the system may be 64-bit rather than 32-bit.
When you first see this, you must find it is notoriously hard to accomplish such goal. However, there’s also 2 assumptions:
- There’s at least one stack vulnerability, and the attacker has the knowledge of how to trigger it.
- Server process will respawns after crash, and the respawn process will not re-rand (which means rerandomize its address space). This is reasonable, it holds for examples like nginx, MySQL, Apache, OpenSSH, Samba, etc.
Remotely dump memory
Since we don’t know the memory layout of the victim program, the first thing we need to do is to find an approach to dump the memory to our screen, thus we need to call a
write system call with a socket file descriptor argument, that is:
write(int sock, void *buf, int len)
To tranform it to the assembly code, we can simply get following 4 pieces of gadgets:
So for this mission, what we need to do is finding these 4 gadgets, and constructing stack to provide appropriate values and invoke them sequentially.
Let’s now stop here, since currently we still have no idea about what the memory layout looks like, especially when the remote system are armed with ASLR and stack canaries.
Breaking Stack Canaries
How? One possible way is “Brute-forcing”, but that’s inefficient, and here the authors propose a so called “stack reading” approach.
Suppose this is the original layout of the compromised stack:
We can try as many times as possible to find the overflow length (until crash due to canary tamper, here is
4096+8=4104), then we fill the 4096 bytes with arbitary value, and leak canary by sequentially overwriting a single byte with value
x is correct, than the server will not crash, we repeat
x for all possible 256 byte values until it is found, till we find all 8 canary bytes, the process is shown as follows:
We can also use this approach to read the saved frame pointer and return address.
Finding the stop gadget
Till now we already have the appropriate canary to bypass the stack canary protection, the next task is to find the mentioned gadgets.
Before exploring for these specific gadgets, there is one type of gadget we need to get: stop gadget.
As we know, when we overwright the return address to some guessed memory address, most of the time the program will crash, e.g., the return address point to code with a null pointer dereference, then the connection will close. However, there’s another situation, where the return address point to code which causes an inﬁnite loop, so the program will just hang, and the connection will still stay open. This kind of gadget, the so called
stop gadget, is fundamental to
ﬁnding other gadgets.
Finding the potentially useful gadgets
Suppose we have a stop gadget that would cause the program to block, like an inﬁnite loop or a blocking system call (e.g.,
sleep), then how can we find other potentially useful gadgets? (here for useful, I mean the gadgets that have some potential functions, other than the ones which will cause crash.)
Since now we can only manipulate the stack, the only things we can do are overwriting return addresses. Suppose if we guess an address of a useful gadget like
pop rdi; ret, the application will still likely crash because it will eventually attempt to return to the next word on the stack, likely an invalid address, and this crash would cause us to discard the gadget classifying it as useless, as shown in the figure:
However, if we have
stop gadget as we mentioned above, it will be quite different. If we fill enough
stop gadgets after the return address we want to probe, as follows:
Then any crash gadget will cause program crash and be marked as useless, while any potential useful gadget will not cause crash. Nevertheless as you can see, there’s still a word called “potential”, which means there’s one exception, that is “stop gadget”. It is possible that the gadge we probe is also another
stop gadget, however it never matters, since we will check later to assure the gadgets that we need.
Dumping memory remotely
As of now, everything seemed ready, then let’s see how to find and validate the 4 pieces of gadgets as we proposed at the beginning:
As shown in the figure, to find the first 2 gadgets:
pop %rsi; ret,
pop %rdi; ret, we just need to find the so called
BROP gadget, which is very common as it restores all callee saved registers. Misaligned parses of it yield a
pop %rdi and
pop %rsi. Unfortunately
pop %rdx; ret gadgets are rare, so the optimization proposed by the authors is to ﬁnd a call to strcmp instead which sets
%rdx to the length of the string being compared, and together with finding
call write, we can find them in the program’s Procedure Linking Table (PLT).
- Finding BROP Gadget
Actually BROP gadgets are very special, since it sequentially
pop 6 values from stack and
ret. So if we use the approach with
stop gadget mentioned above, we can fill 6 crash addresses before the
stop gadget address:
if any potential useful gadget fullfil such condition, it is very likely one of the
- Finding PLT Entries
The PLT is a jump table at the beginning of the executable used for all external calls (e.g., libc). It has a very unique signature: each entry is 16 bytes apart (16 bytes aligned) and the slow path for each entry can be run at an offset of 6 bytes:
What’s more, most of the PLT entries will not cause a crash regardless of arguments because they are system calls that return EFAULT on invalid parameters. One can therefore ﬁnd the PLT with great conﬁdence if a couple of addresses 16 bytes apart do not cause a crash, and can verify that the same addresses plus six do not cause a crash. And these addresses are also the ﬁrst to have valid code as they are early on in the executable’s address space.
So when we find one of the PLT entries, what we need to do is check if it is
strcmp, the authors validate it by exercising it with different arguments and seeing how the function performs. Thanks to BROP gadget, we can control the first two arguments passed to
strcmp, and it will have following behavior and signature:
Based on this signature, we have high possibility to find the
strcmp PLT entry.
write, though there’s not any similar signature, we can still achieve it by scanning each PLT entry and forcing a write to the socket and checking whether the write occurred.
The only complication is ﬁguring out the ﬁle descriptor number for the socket. There are two approaches: chaining multiple writes each with different ﬁle descriptor numbers in a single ROP chain, or opening multiple connections and using a relatively high ﬁle descriptor number in hope that it will match one of the connections.
At this point the attacker can write the entire
.text segment from memory to the attacker’s socket, disassemble it, and ﬁnd more gadgets. The attacker can also dump the symbol table and ﬁnd useful functions in the PLT like dup2 and execve.
So the most challenge part has been solved, the subsequent process will be (simply from paper):
- Redirect the socket to standard input/output. The attacker can use dup2 or close, followed by either dup or fcntl(F_DUPFD). These are often in the PLT.
/bin/sh/in memory. An effective technique is to find a writable memory region like the environment,
environ, from the symbol table, and
/bin/shfrom the attacker’s socket to that address.
execvethe shell. If
execveis not in the PLT, the attacker will need to transfer more of the binary to find a
pop rax; retand
So in conclusion provided by the authors, the BROP attack is as follows:
- Find where the executable is loaded. Either 0x400000 for non-PIE executables (default) or stack read a saved return address.
- Find a stop gadget. This is typically a blocking system call (like sleep or read) in the PLT. The attacker ﬁnds the PLT in this step too.
- Find the BROP gadget. The attacker can now control the ﬁrst two arguments to calls.
- Find strcmp in the PLT. The attacker can now control the ﬁrst three arguments to calls.
- Find write in the PLT. The attacker can now dump the entire binary to ﬁnd more gadgets.
- Build a shellcode and exploit the server.
By far, I’ve briefly introduced the principle of BROP attack, in the 2nd session, I would like to show how I conduct this attack in my setup environment.