Anti-debugging: 1.11 You Are (not) Breakable

February 09, 2018



Disclaimer: This is just for fun, don't use this knowledge for evil!
It's also not infallible: there is no way to secure your application against reverse engineering 100%, unless you never release it.

To preface this: you aren't required to know about things like stack frames, object lifetime, and other such ABI internals. Those will come into play later down the line, but for now they're not needed.

So, a while ago for my (at the time) local DEF CON chapter, DC480, I did a quick presentation on some anti-debugging mechanisms and techniques that one can use to better secure their totally legitimate software against debugging and general reverse engineering. So, here I decided to transcribe that into a series of posts, and to start it off, why don't we try to kick the debugger a bit?

To start off, I'm going to discuss some important information you will need in order to fully grasp the mechanics of what is going on. This starts with the layout of executables, and then covers a small bit on how debuggers work.

Executable Layout

When you compile a program to an executable, the compiler takes the code you write and translates it into a machine code object of some kind. After all the objects have been produced, the linker gets fiddly with the produced object files and packages them all up correctly in the platform's native executable format.

For this post we will be using ELF64, but the same ideas apply to other formats. An executable is more than just a conglomerate of object files in a special order; it has a specific layout that it must follow.

As a general rule of thumb, any executable will have two main parts: the header and the sections. The first is rather self-explanatory: it is the first n bytes of the executable, which dictate things such as the platform and other execution information. This is followed by various sections, including the code segment (also known as the .text section), the data segment, and any other sections that the linker decides to throw inside.

The ELF spec is quite dense, and it has a lot of moving parts[1], so here's a visual aid.

[Figure: ELFMini. Simple ELF layout, courtesy of Ange Albertini.]

One thing that all executables have in common is sections. These partition the code into logical blocks that help maintain structure and allow things like the operating system to make sense of them.

There are many, many sections, but there are two main ones we care about at the moment: .text and .data. The first, .text, is where the code for the program is. The second, .data, is where data such as variable values is located.

One caveat is that, as a rule, you can only execute code that lives in an executable section such as .text. For instance, DEP, or Data Execution Prevention (sound familiar?), is a mechanism that prevents the execution of anything located in the data sections of an executable, meaning that you can't have "hidden" code in the data section that can be run.

Debugger Internals

My favorite answer to how debuggers work is "magic", mainly because even if you have a decent idea of how programming works and how to write code, you may have never even fully used the capabilities of your platform's debugger or know how to use it at all (I saw the latter case much too often in university).

So, here is a basic overview of how the debugger works:

Most modern operating systems have system calls that facilitate debugging child processes, which allows for all kinds of fun. But before we focus on the system calls, let's get one thing out of the way: a debugger (at least the kind we are talking about here) is not like a virtual machine. It doesn't load the child process and interpret the instructions in the .text segment. That's not to say no debugger operates that way: valgrind[2] is a memory debugger and it works just like that, but it also slows down process execution by an order of magnitude or more.

In the case of Linux, any parent process has a large amount of control over its child processes; in particular, there is a system call that allows the parent to observe and control the execution of its child, and it's called ptrace(2)[3]. Another note on Linux is that processes can "adopt" other processes, allowing them to be attached to and ptrace'd, but that's another topic.

Ignoring the platform-specific items, the naive look at how a debugger works can be described as follows. First, you load the executable or attach to the process. Next, you load the symbols from the executable if you can. Then, if possible, you suspend execution of the process, which prevents the operating system's task scheduler from advancing the program's execution. At that point, you are free to read and modify memory, add breakpoints, and the like.

For the most part, that naive view is enough to get us through what needs to be done. However, one thing that needs some more explanation is breakpoints, and how they are not part of ptrace.

A breakpoint is basically a dedicated software interrupt instruction (opcode 0xCC, known mnemonically as INT 3) that causes the operating system to act in a certain way. In the case of Linux, it causes a SIGTRAP to be raised, which signals the debugger that the child process has reached a point that we are interested in. Execution is then halted and the debugger takes control.

Setting a breakpoint is actually relatively straightforward. First, we read the instructions at the address we want to set the breakpoint at. We then save that so that we can put it back later. After we save the instruction, we then replace it with the 0xCC instruction. And when the breakpoint needs to be removed all we do is put it back, easy right?

Something to note is that INT 3 actually has two valid encodings. The first is 0xCC, as mentioned above. The second follows the normal interrupt opcode scheme of 0xCD imm8, so INT 3 can also be assembled to 0xCD03; however, the two-byte opcode does not trigger a SIGTRAP, and any conforming assembler will never generate 0xCD03 from the INT 3 mnemonic.

As an aside, apparently some debuggers will use a random invalid instruction to trigger a SIGILL, but from what I've seen, GDB and LLDB use 0xCC as the opcode. If you want to know a bit more about what the INT 3 instruction does (at least on Intel), section 6.4.4 and Volume 2A page 3-428 in the Intel Architecture Manual[4] are a good start.

Now that we know those things, we can get on to the good stuff~

Breakpoint Detection

The first basic anti-debugging trick we can do is basic breakpoint detection. Now that we know how a breakpoint is encoded we can start looking for them! Because the breakpoint is encoded as a single byte, all we need to do is look for that byte. Now, note: this will only work for any breakpoints set prior to execution, seeing as we look for them prior to running the rest of our code.

First things first, we need to get a handle on the whole .text section. Lucky for us, there are two symbols used by GCC and the linker that let us do just that: _start and _etext. _start is the start of our code and _etext is the end of the .text section. So, all that we need to do is iterate the addresses from _start to _etext looking for 0xCC.

Here is a quick little method for detecting breakpoints:

/*! \file break-detect.c
 * Rudimentary breakpoint detection
 */
#include <stdio.h>
#include <stdlib.h>

extern unsigned char _start;
extern unsigned char etext;

int DetectBreakpoints() {
    int count = 0;
    char* start = (char*)&_start;
    char* end = (char*)&etext;

    printf("_start @ %p\n", start);
    printf("_etext @ %p\n", end);

    while(start != end) {
        if(((*(volatile unsigned*)start) & 0xFF) == 0xCC) {
            ++count;
            printf("Breakpoint at %p: (%x)\n", start, *start);
        }
        ++start;
    }
    return count;
}

int main(int argc, char** argv) {
    DetectBreakpoints();
    printf("BREAK ME!\n");
    return 0;
}

This code is a touch dense, so I'll walk you through it.

The _start symbol is defined as the entry point to your application. Now, you may think that the entry point is main, but there is work that still needs to be done when you start up the application; that's the job of crt0 (sometimes just c0). It is the bootstrap and initialization code that helps get your application up and running, and that is where the symbol _start is defined. In the case of GCC, however, this is spread across several files: crt1.o (which holds _start on glibc systems), along with crti.o and crtn.o. A touch more info on this can be found here.

As for the next symbol, that is defined in the linker script. On my version of GCC and LD (5.3.0 and 2.26.0 respectively), the symbol is defined on lines 70 to 72, or within the context of this snippet, the last three lines:

  .text           :
  {
    *(.text.unlikely .text.*_unlikely .text.unlikely.*)
    *(.text.exit .text.exit.*)
    *(.text.startup .text.startup.*)
    *(.text.hot .text.hot.*)
    *(.text .stub .text.* .gnu.linkonce.t.*)
    /* .gnu.warning sections are handled specially by elf32.em.  */
    *(.gnu.warning)
  }
  .fini           :
  {
    KEEP (*(SORTNONE(.fini)))
  }
  PROVIDE (__etext = .);
  PROVIDE (_etext = .);
  PROVIDE (etext = .);

You'll notice that they are defined right after the .fini section (for more information on .fini see section 17.20.5 of the GCC documentation). LD lets you provide symbols at arbitrary points of the executable. That's what the PROVIDE() statement is for: it defines a symbol at an address or location within the executable, but only if the program references that symbol and does not define it itself. Here, . is the current position in the executable layout. As you can see, there are three symbols defined: __etext, _etext, and etext. There is no difference between any of these symbols but the name, so they all point to the same location in the executable.

The next lines that I should explain are the following:

char* start = (char*)&_start;
char* end = (char*)&etext;

It may seem strange to take the address of a symbol and cast it to a character pointer rather than a uintptr_t or void pointer, but we do that so that we can iterate over each byte of the executable, seeing as a char is a single byte. Casting to a character pointer also lets us increment the address easily, so we can just keep looking without any complicated nonsense.

The last line that needs explaining is the following:

if(((*(volatile unsigned*)start) & 0xFF) == 0xCC) {

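Here we are casting start to a volatile unsigned pointer and then dereferencing that, allowing us to get the raw word at that address. Masking with 0xFF keeps only the lowest byte, which on little-endian x86 is the byte at start itself, and we check whether that byte matches 0xCC.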

And here is the result:

(/tmp) Misaka λ gdb ./bp ...
Reading symbols from ./bp...done.
gdb λ r
Starting program: /tmp/bp
_start @ 0x555555554590
_etext @ 0x5555555547ed
Breakpoint at 0x5555555546fb: (ffffffcc)
Breakpoint at 0x555555554757: (ffffffcc)
BREAK ME!
[Inferior 1 (process 3562) exited normally]
gdb λ tbreak 29
Temporary breakpoint 1 at 0x555555554754: file bp.c, line 29.
gdb λ r
Starting program: /tmp/bp
_start @ 0x555555554590
_etext @ 0x5555555547ed
Breakpoint at 0x5555555546fb: (ffffffcc)
Breakpoint at 0x555555554754: (ffffffcc)
Breakpoint at 0x555555554757: (ffffffcc)

Temporary breakpoint 1, main (argc=1, argv=0x7fffffffe808) at bp.c:29
29 printf("BREAK ME!\n");

While this can detect breakpoints set by GDB, it also flags anything else in the range that contains a 0xCC byte. Therefore, in our example we have two false positives; I'll leave identifying where they come from as an exercise for the reader.

Now that we have this power, what can we do with it? Well, how about we remove the offending instructions? However, that will need to wait for part 2, as this has gotten long enough. In the meantime, happy hacking!

