Is Win32 A Debugging API? If Not, How Close Is It?

Thomas Ptacek | February 26th, 2007 | Filed Under: Reversing, Uncategorized

Assume a black-box pen-test with a Win32 target that has perfect debugger detection (disregarding how hard “perfect” is to achieve). Arbitrarily, assume no access to the kernel; in fact, no administrator privilege at all. We simply run in processes alongside the target with the same credentials.

How much control do we have over the target?

  • We can get a Win32 handle to the process with OpenProcess.

  • We can read process memory by virtual offset and length with ReadProcessMemory.

  • We can enumerate the threads in the target with Toolhelp32.

  • We can suspend or resume individual threads with OpenThread, SuspendThread, and ResumeThread.

  • We can write process memory by virtual offset and length with WriteProcessMemory.

  • We can allocate memory within the target with VirtualAllocEx.

  • We can change memory protection with VirtualProtectEx.

  • We can enumerate modules and offsets within the target with Toolhelp32.

  • We can map out memory regions with VirtualQueryEx.

  • We can execute code in the context of the process with CreateRemoteThread (and RtlRemoteCall).
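
As a rough illustration of the first two bullets, here is a minimal ctypes sketch. The helper name `read_remote`, the `TOOLKIT_ACCESS` mask, and the pid and address in the usage note are my own placeholders, not anything from a real tool; the access-right values are the ones defined in the Windows headers. The Win32 calls are looked up only inside the function, so the module still imports on other platforms.

```python
import ctypes
import sys

# Access-right values from the Windows SDK headers (winnt.h).
PROCESS_CREATE_THREAD = 0x0002
PROCESS_VM_OPERATION = 0x0008
PROCESS_VM_READ = 0x0010
PROCESS_VM_WRITE = 0x0020
PROCESS_QUERY_INFORMATION = 0x0400

# Hypothetical name: one mask covering the operations in the list above
# (read, write, allocate, reprotect, query, start a remote thread).
TOOLKIT_ACCESS = (PROCESS_CREATE_THREAD | PROCESS_VM_OPERATION |
                  PROCESS_VM_READ | PROCESS_VM_WRITE |
                  PROCESS_QUERY_INFORMATION)


def read_remote(pid, address, length):
    """Read up to `length` bytes at `address` in process `pid` (Windows only)."""
    if sys.platform != "win32":
        raise OSError("this sketch only runs on Windows")

    k32 = ctypes.WinDLL("kernel32", use_last_error=True)
    k32.OpenProcess.restype = ctypes.c_void_p
    k32.OpenProcess.argtypes = [ctypes.c_ulong, ctypes.c_int, ctypes.c_ulong]
    k32.ReadProcessMemory.restype = ctypes.c_int
    k32.ReadProcessMemory.argtypes = [
        ctypes.c_void_p, ctypes.c_void_p, ctypes.c_void_p,
        ctypes.c_size_t, ctypes.POINTER(ctypes.c_size_t),
    ]
    k32.CloseHandle.argtypes = [ctypes.c_void_p]

    # Reading only needs VM_READ; QUERY_INFORMATION is included so the same
    # handle could also be passed to VirtualQueryEx.
    handle = k32.OpenProcess(PROCESS_VM_READ | PROCESS_QUERY_INFORMATION,
                             False, pid)
    if not handle:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        buf = ctypes.create_string_buffer(length)
        got = ctypes.c_size_t(0)
        ok = k32.ReadProcessMemory(handle, ctypes.c_void_p(address), buf,
                                   length, ctypes.byref(got))
        if not ok:
            raise ctypes.WinError(ctypes.get_last_error())
        return buf.raw[:got.value]
    finally:
        k32.CloseHandle(handle)
```

On a Windows box, something like `read_remote(1234, 0x400000, 64)` would return the first 64 bytes at that address in process 1234, assuming the process exists, runs under the same credentials, and has that page mapped.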

How much control would a debugger have given us?

  • We’d be able to suspend and resume threads, which we can do anyways.

  • We’d be able to read and write memory, which we can do anyways.

  • We’d be able to set breakpoints.

  • We’d be able to single-step the program.

  • We’d be able to read register contents.

  • We’d be able to call functions, which we can do anyways.

  • We’d be able to search memory for strings, which we can do anyways.
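
The string search in the last bullet is mostly bookkeeping once the bytes are in hand. The sketch below assumes a region's contents have already been copied out (say, with ReadProcessMemory over a range reported by VirtualQueryEx) and only shows the scanning step. The function names are hypothetical, and checking both ASCII and UTF-16LE is my own choice, on the grounds that Win32 programs commonly hold strings in either form.

```python
def find_all(data, needle, base=0):
    """Return the virtual address of every (possibly overlapping) match."""
    hits = []
    pos = data.find(needle)
    while pos != -1:
        hits.append(base + pos)
        pos = data.find(needle, pos + 1)
    return hits


def find_string(data, text, base=0):
    """Search a region snapshot for `text` as ASCII and as UTF-16LE."""
    hits = set()
    for encoding in ("ascii", "utf-16-le"):
        hits.update(find_all(data, text.encode(encoding), base))
    return sorted(hits)


# A made-up region snapshot: one narrow and one wide copy of the string.
region = b"AAsecretBB" + "secret".encode("utf-16-le")
addresses = find_string(region, "secret", base=0x400000)
```

Here `base` is the address the region was read from, so the results come back as addresses in the target rather than offsets into the local copy.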

Without anything more interesting than the MSDN man pages, we come reasonably close to this without invoking the debug interface. In exchange for not showing up in NtQuerySystemInformation or mucking with the UEF, we give up easy-to-use breakpoints, single-stepping, and register access. But:

  • We can trivially get single-use, nonrecoverable breakpoints; just inject INT 3 into the text with WriteProcessMemory.

  • We can potentially get access to registers using GetThreadContext and SetThreadContext (natively, NtGetContextThread and NtSetContextThread).

  • We can hook functions, in all the usual ways.
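
The INT 3 trick from the first bullet can be sketched without touching a real process. In the code below, `read` and `write` are hypothetical callables standing in for thin wrappers around ReadProcessMemory and WriteProcessMemory, and `FakeMemory` is a made-up stand-in for the target's address space so the logic runs anywhere. Saving the original byte is my addition: it does not make the breakpoint recoverable once hit, since nothing here catches the resulting exception, but it does let the patch be backed out beforehand.

```python
INT3 = b"\xcc"  # x86 one-byte breakpoint opcode


class FakeMemory:
    """Stand-in for the target's address space (illustration only)."""

    def __init__(self, base, image):
        self.base = base
        self.image = bytearray(image)

    def read(self, address, length):
        start = address - self.base
        return bytes(self.image[start:start + length])

    def write(self, address, data):
        start = address - self.base
        self.image[start:start + len(data)] = data


def plant_int3(read, write, address):
    """Overwrite the byte at `address` with INT 3; return the original byte."""
    original = read(address, 1)
    write(address, INT3)
    return original


def remove_int3(write, address, original):
    """Put the saved byte back, undoing plant_int3."""
    write(address, original)


# push ebp; mov ebp, esp; ret
mem = FakeMemory(0x401000, b"\x55\x8b\xec\xc3")
saved = plant_int3(mem.read, mem.write, 0x401000)
```

Against a live target the page would likely also need to be made writable first with VirtualProtectEx, which is the reason that call is on the list earlier in the post.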

How close can we get to a fully-functional debugger without Windows knowing about it? I’ve been playing with this for a couple weeks, using Python and ctypes, turning IDLE into a debugger prompt. And it seems like the answer is “extremely close”. More later, but if this is totally obvious, pointers, links, and comments are welcome.

5 Comments so far

  • Skywing

    February 26th, 2007 5:58 pm

    This is what attaching to a process noninvasively in WinDbg/ntsd/cdb will do - debug without attaching as a debugger.

    WinDbg doesn’t give you any facilities for controlling execution of the target in this mode, however.

    The big problem will be coming up with a good way to trap exceptions without attaching as a debugger or injecting code into the target to hook the user mode exception dispatcher (I rule this out as anything that is reasonably effective at catching debuggers should of course try and detect things overwriting obvious targets like the exception dispatcher code; saying a target has perfect debugger detection and then nothing to try and detect hooks would be a fairly trivially bypassable anti-analysis mechanism altogether). You might try investigating the default unhandled exception port, which typically is connected to code in CSRSS for Win32 processes. Subverting the exception port for an already running process would typically involve injecting code into CSRSS though - so much for nonadmin.

    I am not sure what you really stand to gain by going through all this trouble, though. Most all of the techniques you described are `easily’ detectable by code in the target process itself; in fact, I would tend to argue that hiding a `noninvasive debugger’ that works by injecting code and the like to control breakpoints would be about as difficult (probably more so in fact) as hiding a `real’ debugger from the target (though you did say to ignore that facet of this discussion…).

  • Thomas Ptacek

    February 26th, 2007 6:09 pm

    I concede that it’s hard to do breakpoints “more stealthily” by modding process memory than it is to simply use the Win32 debugging interface, with three caveats:

    - You have to know to look for text modification.

    - You have to solve the chicken/egg problem of writing code to verify itself, relying on an “untrustworthy CR3”.

    - While still cat-and-mouse, note that you don’t have to trigger “formal” exceptions to do breakpoints; it’s obviously not enough just to scan function pointer tables.

    I don’t agree with your argument that it’s not worth it, primarily because it’s not hard. That’s kind of the point: the Win32 API is extremely powerful even without its formal debugging interface.

  • Skywing

    February 26th, 2007 6:37 pm

    Still, the debugger interface gives you greater isolation between the target and the debugger than `hacking your own’ debugger interface, as far as control flow goes. This means that your adversary has only several, well-defined ways to detect you (the debugger), other than by trying to rely on the debugger leaking its presence by introducing side-effects on certain operations (such as handling breakpoint exceptions by default in many debuggers).

    All that the debugger API really provides you is a way to stop and examine execution at exceptions. There are ways for `custom events’, such as module loading, to be delivered to a debugger via a standard interface, but at its heart the debugger API is really just a way to give the debugger control over execution flow. Other than that, the rest of the APIs you mentioned, such as thread context manipulation or remote virtual address space manipulation would be used by a debugger that is operating with the debugger API directly depending on what the user is directing it to do.

    Certainly, a target must know how to look for modifications to its address space in order to detect a `noninvasive debugger’ that is doing `invasive’ things (e.g. attempting to modify program flow control) via injecting things into the target, but the same goes for detecting a debugger too, except in that case there are only a couple of ways to definitively isolate a debugger (aside from side effect observation), due to the fact that most of the magic gets hidden by the kernel (this again assumes that both processes are not admin).

    I suppose the point that I am trying to make is that, while not worthless, this approach is really just going to end up moving the problem around in the debugger/debuggee cat-and-mouse game. In other words, you could for instance construe this as a way to take advantage of the fact that software already out there today has stronger debugger detection than code patching detection, to the end result of bypassing the program’s protection schemes and successfully observing and modifying its behavior.

    If you buy that, then I think that in the long run, once non-admin programs with anti-analysis features adapt, it will end up being better to just use the debugger API and work on hiding the smaller set of ways to detect a debugger than to hide code modifications.

    Of course, if any process in the mix has admin, then you’re open to a wide variety of other possibilities, but that’s outside the scope of this discussion per the restrictions you mentioned in your original posting.

    I will say that I believe that a mostly `passive’ `debugger’ that doesn’t inject code into the target will definitely be much harder to detect than a full-blown debugger using the kernel-provided debugging APIs. You might be able to get what you want with this approach, especially if things that the target is doing are relatively easily interceptable without invading the address space of the target (e.g. networking related calls whose results could theoretically be caught on the wire and modified in conjunction with a `passive’ debugger on-box, or local IPC calls where the remote endpoint could be subverted somehow by creating a named object before the intended target created a named object or the like). This would tend to be program-specific, however, with respect to its applicability.

    As far as truly undetectable debugging goes, though, personally I would root for virtualization instead of relying on OS-level debugger support, which is especially useful given that many interesting real world cases do involve programs that run with high privileges or that are drivers. If you’re working with a VM like Bochs that has integrated support for modifying the machine context, then that gives you a way to completely transparently inspect (and a much harder to detect way to modify, from the perspective of anti-debugging facilities in the target) the execution state of the target.

    Of course, that too may not always work; for instance, if a program is network-aware, then it might notice if the (virtual) machine was stopped for inspection in the debugger, even if the clocks on the machine were altered to make it appear as if no time has passed.

  • Thomas Ptacek

    February 26th, 2007 7:32 pm

    I agree with everything you said. Virtualization works well for userland debugging as well, although I’m cheating and doing “classical” breakpoints (writing an opcode that traps a VT exit), which are detectable the same way code patching is — Joanna has already virtualized the debug registers, though.

    I also agree with the subtext, which is that code-patching is more invasive than standard debugging. But code-patching is currently less detectable than conventional invasive debugging. It should be a PaiMei option!

    Under the scenario I’ve outlined (how about reading it as, “you got unprivileged user remote code exec on a remote target and you want to steal a secret from a program running alongside you on that target”), how do you get past NtQuerySystemInformation without code-patching?

    Another interesting question: you’re unprivileged, not admin; how do you reliably validate your program text, and all its dependencies (vtables, imports, trampolines, etc)? Hypothetically. You’re obviously not just storing a hash in memory. Are you incurring file IO and filesystem metadata queries every N milliseconds?

  • ivan

    February 27th, 2007 2:15 pm

    This is used in the test cases of the Vista Logo Certification:

    8. ThreadHijacker – Is a command line tool that injects crashes into another process by: pausing a thread, injecting binary data into that process’ address space, setting the thread’s instruction pointer to that binary data, and resuming the thread. This tool is supplied by Microsoft and is packaged with the Vista Logo Tools.
