Archive for the ‘Reversing’ Category

Retsaot is Toaster, Reversed: Quick ‘n Dirty Firmware Reversing

Eric Monti | April 29th, 2008 | Filed Under: Reversing

I recently worked on a project that involved embedded systems and reverse engineering. This sort of territory can be a little hairy the first few times out. I ran into some interesting challenges and discoveries along the way which I thought might be worth writing a little bit about. I can’t tell you what the target was. But, it was important. And, we beat the crap out of it. So instead, I’ll tell you what I wish it was: a networked 4-slot toaster.

Now… to make things interesting: early on, I’d discovered a vulnerability in the toaster that allowed any attacker to load their own firmware on the device. Ouch! My toast! My beautiful toast!

In order to drive home the risk (mostly to the vendor) of the firmware loading vulnerability, I was asked by my customer (also the vendor’s customer) to demonstrate the attack by actually loading malicious firmware onto the device and getting it to run.

Mind you, the request to prove this is actually pretty sane. I had little knowledge of the boot loader, or even of the firmware image format. I couldn’t say for sure that there wasn’t a code-signing feature, which would prevent the toaster from loading any image that wasn’t cryptographically signed by the vendor. That would have rendered the firmware loading attack impotent. To make things worse, the vendor was being pretty light on details. Can’t say I blame them.

So… I was tasked with demonstrating that the vendor’s firmware:

  1. Could be reverse engineered
  2. Modified
  3. Loaded back onto the device
  4. By an attacker

Think “embedded rootkit”. Embedded rootkits are sort of a holy grail: it’s easy to load a rootkit on a PC, but tricky to get them onto a wireless router or switch (or toaster). The reward for doing it, though, is that nobody ever thinks to check their router or toaster for rootkits. Initially, my bar was just going to be to demonstrate I’d actually changed some aspect of the program with my own firmware patch and move on to more penetration testing.

Some embedded projects are easier than others. Sometimes you get lucky. You find a debug shell, a file-system, or a serial console. A lot of times, devices that look like black boxes have JTAG ports, to which you can attach a debugger. Devices built in the last 10 years tend to have GDB stubs, so you can target them with cross-debuggers. Some even have firewire, through which you can DMA in and out of system memory.

This wasn’t any of those. Without reversing the firmware, I had no way to execute code on the device. For the purposes of penetration testing, I really couldn’t even see what was actually happening on my target. I hate that. All I had to work with was the firmware image file itself, offline, and some observable external behaviors of the device.

So… if I was going to make any of my objectives happen, I was basically starting purely from static analysis. Back in 2005, Tom wrote about a similar set of steps he took given a similar starting point. Without realizing it this time out, I actually followed them almost to the letter. There were a few points where our paths diverged slightly because of differences in our targets and approaches.

This is my story. Details have been fuzzed to protect the guilty toaster.

1. In Which I Get The Image

I start out with a couple megs worth of firmware image file. This file is sent to the device when you do a firmware upgrade the official way. You can usually find firmware images on the vendor’s download site, or on the CDs that came in the box.

2. In Which I Scan The Image

I feed it to deezee. Deezee is part of Matasano’s homegrown “Black Bag” toolkit. What deezee does is search through a binary file for compressed data by looking for zlib signatures, and then extracting anything it finds. It’s crazy how often this seems to work on unknown binary files. As it turns out, zlib is the industry standard compression format, even for toasters.

Sure enough, deezee finds not one, but 3 distinct blobs of compressed data packed in the file. Hmm… really… deezee should tell me where it finds these things, I think. Let’s fix that now. Much better.

Edit: Here’s the link to blackbag with patch applied.
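
For readers without Black Bag handy, here is a minimal Ruby sketch of the same idea. It is not deezee's actual code: the method name `find_zlib_blobs` and the `min_output` threshold are my own inventions for illustration. It checks every offset for a plausible two-byte zlib header, tries to inflate from there, and reports the offset of anything that decompresses cleanly.

```ruby
require "zlib"

# Hypothetical stand-in for deezee (not its real implementation):
# walk the file and try to inflate wherever two bytes look like a zlib header.
def find_zlib_blobs(data, min_output: 16)
  data = data.b                        # work on raw bytes
  hits = []
  pos = 0
  while pos < data.bytesize - 1
    cmf = data.getbyte(pos)
    flg = data.getbyte(pos + 1)
    # zlib header: low nibble of the first byte is 8 (deflate), and the two
    # bytes read as a big-endian 16-bit number are a multiple of 31.
    if (cmf & 0x0f) == 8 && ((cmf << 8) | flg) % 31 == 0
      z = Zlib::Inflate.new
      begin
        out = z.inflate(data.byteslice(pos, data.bytesize - pos))
        if z.finished? && out.bytesize >= min_output
          hits << { offset: pos, compressed_size: z.total_in, data: out }
          pos += z.total_in            # skip past the blob we just extracted
          next
        end
      rescue Zlib::Error
        # false positive: the bytes only looked like a zlib header
      ensure
        z.close
      end
    end
    pos += 1
  end
  hits
end
```

Something like `find_zlib_blobs(File.binread("firmware.bin")).each { |h| printf("0x%08x %d bytes\n", h[:offset], h[:compressed_size]) }` would print the offsets; the filename is a placeholder.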

3. In Which I Read

With addresses for compressed chunks in hand, I open up the file in my hex editor.

Deezee found the first compressed segment around 384 bytes from the beginning of the file. Here’s where technique comes into the picture.

Deezee found three “hits” in the image. I assume they aren’t just sprinkled haphazardly throughout the image; there should be some kind of header on them. I want to know what that header looks like. Maybe it’s used for things besides compressed blobs? The way I’m going to do that is, I’m going to compare the bytes surrounding the compressed blobs for all three hits.

So, comparing the preceding 384 bytes of that and the other two hits, I see several similarities:

  • A 4-byte signature. I can tell because it’s always the same four bytes, in the same place relative to the blob.
  • An ASCIIZ string, apparently describing the firmware version, padded to 128 bytes. You see “padded strings” all the time in binaries; what they are is a C-struct with a “char version[128]”.
  • A 16-byte ASCIIZ string of numbers describing just the version, again padded with NULs.
  • 4 bytes which, when unpacked as an integer, happen to match the size of the compressed chunk. (If your hex editor won’t tell you the big and little endian 32-bit value of four highlighted bytes, get a new hex editor). This is the trick that cracks most image formats: you look for 16 or 32-bit fields that look like lengths, and try to reconcile them to the file and its features.
  • 4 bytes which… hmm… could be CRC32 checksum? This is a bit of a shot in the dark, but it’s easy enough to check:
$ cat app.bin | ruby -e \
'require "zlib"; puts Zlib.crc32( STDIN.read ).to_s(16)'
a76ea2ad

And… they match! I’ll need to keep this checksum in mind when I try modifying the file. Bodes well for no code signing!

Why did I try a CRC32 checksum first? Well… first off I assumed the field was 32 bits long. And CRCs (Cyclic Redundancy Checks) are used a lot in file formats that encapsulate other files. As you can see above, I used Zlib’s crc32 function to check my file chunk. Could just as easily have been OpenSSL’s since it too is a standard Ruby library. Two shining examples of how common CRC32 is.
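
Once the fields are identified, the header comparison above boils down to a single `unpack` call. The sketch below is illustrative only: the field order, the absence of padding between fields, and the method names `parse_chunk_header` and `chunk_crc_ok?` are assumptions of mine, not the real (fuzzed) layout. It also assumes the CRC32 covers the chunk bytes exactly as they are stored in the image.

```ruby
require "zlib"

# Hypothetical header layout, for illustration only:
#   a4   4-byte signature
#   Z128 NUL-padded description string (char description[128])
#   Z16  NUL-padded version string    (char version[16])
#   N    chunk length, 32-bit big endian (use "V" for little endian)
#   N    CRC32 of the chunk, 32-bit big endian
def parse_chunk_header(bytes)
  sig, description, version, length, crc = bytes.unpack("a4 Z128 Z16 N N")
  { signature: sig, description: description, version: version,
    length: length, crc32: crc }
end

# True when the length and CRC32 fields both reconcile with the payload.
def chunk_crc_ok?(header, payload)
  payload.bytesize == header[:length] &&
    Zlib.crc32(payload) == header[:crc32]
end
```

If both checks pass for every chunk in the image, the guess about the length and checksum fields is probably right.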

4. In Which Other Headers Are Found

Now that I know what to look for, I find out that this header also prefixes some other chunks. Ones that aren’t compressed and so weren’t noticed by deezee. I also check the CRC32 header field for each chunk, and it’s used on the others too. Five total, including the compressed ones. The last one is a big chunk of base64 which looks like it might be an encryption signature. Hmm… might be used for code signing somehow? Looks like I still need to confirm or discount this possibility.

5. In Which I Check For Metadata

Go back and look again at the original chunks of compressed data. The zlib headers in the file included the original filenames which give me some clues as to what each is. Going on filenames, I’m thinking I’ve got phase one and two boot-loaders and the third chunk is the actual app. I’m interested in the application. The Unix ‘file’ command doesn’t tell me anything useful about any of them, but it’s always worth a shot.

6. Behold: Low Hanging Fruit

I run GNU strings on the application file with ‘-t x’ so I get nice hex offsets for where the strings are found. Lots of interesting stuff:

  • Uh… well there’s this: Nice ASCII art. Just a wild guess, but I think maybe this is VxWorks?
  • The regular string ‘vxworks’ shows up in a lot more places too.
  • “GNU ld version 2.9-mips3264-010729”. Ok so it’s MIPS. This is also good news. It means that somewhere out there, there’s probably a GNU binutils distribution that can understand this file format.
  • Lots of what looks like function names close together. When do function names occur in compiled output? Two reasons: debugging code, or a symbol table. A symbol table would be a score; I make a note of that for later.
  • Lots of assertion messages with references to source code line numbers xxx.c(###). Always handy. Even if you don’t have symbols for functions, with an hour of data entry in your disassembler you can usually fake them based on debug and assertion strings. You want to take the time to do this. By the time you’ve guessed even 10% of the symbols in an image, you’re usually going to be able to comprehend 90% of what the image does, without reading any assembly.
  • The strings you see when the device boots up.
  • Lots of error, exception, and status messages such as one typically finds in a program. Sometimes very handy as we’ll soon see.

7. In Which Firmware Is Patched

I’ve got a pretty good idea of what the program is and how it’s rolled up in the firmware image file.

My job right now is to figure out whether I can change the firmware and load it onto the toaster. So I make a simple change to it: there’s a string which gets displayed when the system boots up. Using my hex editor, I change it to something noticeably different, being careful to keep it the same size as the original. Then I re-compress it at the same level as the original, re-roll it into the header format, and change the CRC32 checksum.

I load it onto my target using the bug I found… and… sweet satisfaction! If there’s any code signing here, it doesn’t work. That spooky looking signature at the end is for something else. Knowing this, it’s well worth the effort to keep reversing the image.
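
The patch step is mechanical enough to script. Here is a rough Ruby sketch under a few assumptions of mine: the chunk is a plain zlib stream, the CRC32 is computed over the recompressed bytes as stored, and the compression level is passed in by hand. The method name `patch_and_repack` is hypothetical, and writing the new length and checksum back into the real header is left out because that layout is specific to the target.

```ruby
require "zlib"

# Swap one string for another of identical length inside a compressed chunk,
# then recompress and recompute the two header values that depend on it.
def patch_and_repack(compressed, old, new, level: Zlib::DEFAULT_COMPRESSION)
  unless old.bytesize == new.bytesize
    raise ArgumentError, "replacement must be the same length as the original"
  end

  image = Zlib::Inflate.inflate(compressed)
  at = image.index(old.b)
  raise ArgumentError, "original string not found" if at.nil?

  image[at, old.bytesize] = new.b      # in-place overwrite, offsets unchanged
  payload = Zlib::Deflate.deflate(image, level)

  # Assumption: length and CRC32 both describe the chunk as stored (compressed).
  { payload: payload, length: payload.bytesize, crc32: Zlib.crc32(payload) }
end
```

Keeping the replacement the same size matters because nothing else in the image gets relocated; every pointer into the string area stays valid.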

8. In Which A Loader Offset Is Sought

Now, the ‘file’ command didn’t recognize the decompressed image, nor did it recognize either of the apparent “boot-loaders”. If it was a well-known format like ELF or COFF, it would have.

Unfortunately, the executable headers are what tell us how to load the whole program into memory. Without a known executable format, I’ll need to find out the load offsets myself. If I don’t, when I load the program into a disassembler, none of the data and instruction offsets will make sense. You can read a binary image with broken offsets, but it’s not fun.

Headers are usually located at the beginning of the file. But I’m not seeing any strong indications of any sort of header information at all in this one. My guess: an older version of VxWorks, predating VxWorks ELF. On the bright side, firmware images aren’t usually relocatable. Unlike a Windows PE program, which is literally edited and patched up by the loader at runtime, firmware tends to be loaded at a specific address.

I could try reversing either or both of the bootloaders to find that address, but… I might just run into the same problem walking all the way back up to the first boot-loader. To be honest, I’m not really interested in unlocking the secrets of the toaster boot loader. If I was trying to jailbreak the toaster to heat unauthorized off-brand Pop Tarts, maybe I would be. Exercise for the reader.

9. In Which I Infer A Loader Offset

I’ve got another idea. Let’s look closer at some of these strings and their surrounding data. I scroll around in my hex editor around the sections with strings until I’ve found what I’m looking for.

What we’re looking for is a list of strings preceded by their address table. At this point, I really don’t care what the strings are, just that they are recognizable text. The addresses prefixing the strings in the table are 32-bit big endian numbers pointing to the ASCIIZ strings. Only parts of the addresses match up to real addresses in the file, though, and this is key. The first entry (blue) points to 0x104DEDD8. The string actually lives at 0x004D9DD8 in the file. Subtract the two and you get 0x10005000 - the load offset for this segment.

If I’m unsure of my assumptions here, I check some of the other entries in the table (red, green, and brown). Same pattern holds. We’ve got a winner.
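
The arithmetic and the cross-check against the other table entries can be written down in a few lines of Ruby. This is a sketch, with method names (`load_base`, `read_pointers`, `points_at_strings?`) that I made up for the purpose. The "looks like a string" test is a crude heuristic of mine: the byte before the target is NUL and the target byte itself is printable ASCII.

```ruby
# Load base = address stored in the table - where the string really is on disk.
def load_base(pointer, file_offset)
  pointer - file_offset
end

# Read `count` 32-bit big-endian pointers starting at table_offset.
def read_pointers(image, table_offset, count)
  image.byteslice(table_offset, 4 * count).unpack("N*")
end

# Sanity check a candidate base: every pointer, once rebased, should land on
# the first byte of a string (previous byte NUL, this byte printable ASCII).
def points_at_strings?(image, pointers, base)
  pointers.all? do |ptr|
    off = ptr - base
    next false if off < 1 || off >= image.bytesize
    image.getbyte(off - 1).zero? && image.getbyte(off).between?(0x20, 0x7e)
  end
end
```

With the numbers from the hex dump, `load_base(0x104DEDD8, 0x004D9DD8)` gives `0x10005000`.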

10. In Which I Sanity Check The Offset

Search for 0x10005000 and 0x5000 near the beginning of the file. I want to confirm my assumptions about the lack of a header. Nothing comes up. I’m still very certain this is my winner. The lack of a search hit here suggests again that the loader decides where to load everything, though this program’s compiler knew where that would be when it was compiled too.

11. In Which I Disassemble

We are now at step 11 and it is time to load the file in a disassembler.

We use IDA Pro. IDA wants to know the architecture of the file. We choose a processor type of “mipsb”; that’s MIPS, which we learned from scanning the image format for strings, and running in big-endian mode.

How do we know it’s big endian? Big endian is a safe guess for non-X86 architectures, but in this case it’s more than a guess: the addresses and lengths in the file were big endian. What if we hadn’t seen a string identifying it as MIPS? I’d probably take a fragment of the file and feed it through “objdump -d” multiple times, specifying MIPS, PowerPC, ARM, and SPARC, in that order. You know you have a match when the instructions sort of make sense.
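
The "lengths in the file were big endian" observation can be turned into a quick test: read a suspected length field both ways and see which reading could possibly fit in the file. The helper below is a sketch with a made-up name, `guess_endianness`; the only inputs are the four bytes of the field and an upper bound such as the image size.

```ruby
# Interpret a 4-byte field as big endian ("N") and little endian ("V") and
# report which reading is a plausible length (between 1 and max_plausible).
def guess_endianness(field, max_plausible)
  big    = field.unpack1("N")
  little = field.unpack1("V")
  big_ok    = big.between?(1, max_plausible)
  little_ok = little.between?(1, max_plausible)

  return :big    if big_ok && !little_ok
  return :little if little_ok && !big_ok
  :ambiguous       # both or neither fit; look at another field
end
```

For a couple-of-megabytes image, a field holding `00 04 D2 A0` only makes sense as big endian (316,064 bytes); read the other way it would be about 2.7 billion.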

I let it load as one big segment. Once loaded in IDA, I go to “Edit -> Segments -> Rebase Segment” and enter the address I came up with in step #9. Rebasing tells the disassembler what the load offset is.

  • Here I have to tell you about Thomas’ irrational fear of IDA’s rebasing. According to Tom this is based on a bad experience rebasing a 10 megabyte firmware image, waiting a day for it to finish, and only then noticing he got the address wrong. Here’s a handy trick Thomas uses instead of rebasing: feed the file to binutils “objcopy”. Objcopy is a best-kept-secret for reversing: you can feed it a file of type “binary”, and tell it to spit out an ELF file, specifying the header values. Several other tools suck at handling raw files, but rock at handling ELF. Why is his fear irrational? Well… I just disable IDA’s auto-analyze feature before I rebase. Then I do some manual poking around first to make sure I’m on the right track before turning it back on again. But… nobody tell Thomas this! He always comes up with crazy cool tricks using other tools when he gets pissed off at IDA.

12. In Which We Work Out A Symbol Table

Based on where I found strings in the image, I have a pretty good idea of the general region of memory where program data lives. Remember those function names I filed for later? Now is later.

I take the addresses of a few function name strings I find close together and search the file for the four bytes making up their (newly rebased) addresses. I’m consistently getting search hits near the end of my probable data segment, all in the same general area. Look around there and start converting every 4-byte aligned chunk starting with 0x104 to an offset. I start seeing offsets to strings of what looks like function names at 16-byte intervals. The string offsets are right next to offsets pointing way back into the code area and some pointing into the data area. I look a little closer at the hex dump of this area:

Blue is what points to the “name” of the symbol, red points to the actual thing in memory. The last column is the “type” of thing. Green things (0x500) are functions, and purple (0x700) are data. There’s also some oddballs strewn about with a type of 0x900. These point way out of bounds past the end of my rebased file. Could be a segment I don’t know about from this file, or it could just be something else created at run-time. I don’t stress about this, stick with what I know for sure right now.

I take this pattern and translate it into a structure for IDA, then find where the table begins and ends based on the pattern. This is an array, so that’s how I define it in IDA. The last element of the array is 16 bytes of 0x00 helping me (and IDA) see where it ends.
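
Outside of IDA, the same table can be walked with a short Ruby method. Treat this as a sketch: I am assuming each 16-byte entry is four big-endian 32-bit words in the order unknown word, name pointer, value pointer, type. The real column order comes from the hex dump and may well differ. The method name `parse_symbol_table` and the `:kind` labels are mine; the 0x500 and 0x700 type values are the ones described above.

```ruby
# Walk 16-byte symbol entries until the all-zero terminator entry.
# Assumed entry layout (hypothetical): unknown, name_ptr, value_ptr, type.
def parse_symbol_table(image, table_offset, base)
  kinds = { 0x500 => :function, 0x700 => :data }
  symbols = []
  offset = table_offset
  loop do
    entry = image.byteslice(offset, 16)
    break if entry.nil? || entry.bytesize < 16
    break if entry.bytes.all?(&:zero?)           # 16 bytes of 0x00 end the array

    _unknown, name_ptr, value_ptr, type = entry.unpack("N4")
    name_off = name_ptr - base                   # rebase pointer to file offset
    name = if name_off.between?(0, image.bytesize - 1)
             image.unpack1("@#{name_off}Z*")     # read the ASCIIZ name
           end
    symbols << { name: name, address: value_ptr, type: type,
                 kind: kinds.fetch(type, :unknown) }
    offset += 16
  end
  symbols
end
```

Entries whose name pointer falls outside the file (like the 0x900 oddballs might) come back with a `nil` name rather than stopping the walk.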

13. In Which I Import The Table Into IDA

Things start getting recognized quickly by IDA’s auto-analysis once I tell it about all these symbol offsets. But I also want IDA to know the names of everything in the symbol table.

I write a quick Ruby script to write an IDC script to do this work for me. IDC is the scripting language built into IDA; in the 21st century, most people write their IDA tools in Python or Ruby, but it’s sometimes faster to write short scripts that use IDC as a half-way format. Run ruby… run IDC… take a look at the results. Awesome! Just about everything was identified in this symbol table! Furthermore, everything was identified in the same segment off of one load address.
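
A generator along those lines might look like the sketch below. It is not the script I used on the toaster. It assumes the symbols have already been collected as an array of hashes with `:name`, `:address` and `:kind` keys (a shape I picked for this example), and it emits the classic IDC calls `MakeName` and `MakeFunction` as I remember them from IDA's idc.idc of this era; check the names against your IDA version before relying on them.

```ruby
# Turn a list of symbols into an IDC script that names each address and
# asks IDA to create a function wherever the table says there is one.
def idc_script(symbols)
  lines = ["#include <idc.idc>", "", "static main() {"]
  symbols.each do |sym|
    name = sym[:name].to_s
    # Only emit names that are safe to drop into an IDC string literal.
    next unless name.match?(/\A[A-Za-z_$.][A-Za-z0-9_$.]*\z/)

    ea = format("0x%08X", sym[:address])
    lines << "  MakeFunction(#{ea}, BADADDR);" if sym[:kind] == :function
    lines << "  MakeName(#{ea}, \"#{name}\");"
  end
  lines << "}"
  lines.join("\n") + "\n"
end
```

Write the result to a file with `File.write("symbols.idc", idc_script(symbols))` and load it from IDA's script file menu; the filename is just an example.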

Just to make sure about that missing header I check my final results from the symbol table. Indeed, two of the function symbols point right at 0x10005000, the beginning of the program. One’s called ‘sysInit’ and the other ‘start’. There can’t be a header there, it’s a function. Though I can tell as much from the name “start”, a little googling and research tells me “sysInit” is what VxWorks usually uses as its program entry point.

Besides proving the firmware loading risk, I had what was shaping up as a target rich source of vulnerabilities on my hands. I really wanted a look at the code running on the device for the purposes of additional vulnerability testing. Now that I had readable, cross-referenced disassembly, I had a lot more to work with for finding and confirming other vulnerabilities both already found, or that I might find as I moved forward.

As Tom put it, “hilarity ensued”. Truly, vulnerability research is just as good in ‘08 as it was in ‘05. Unfortunately, I can’t talk specific details from there, but I wanted to share with our readers the road I took to this point.

Getting a full reversing of an embedded system like this is pretty satisfying and can open up lots of possibilities. The DD-WRT, OpenWRT, and NSLU2-Linux Unslung crowds have gone nuts with this sort of thing with pretty impressive results putting Linux distros on just about anything they can get their hands on.

I’d love to hear about some of your own embedded system reversing experiences.

Exploring Protocols 2: Writing some tools

Eric Monti | October 28th, 2007 | Filed Under: Development, Reversing

In this much delayed installment I’d like to expand on my last one entitled “Exploring Protocols 1”. This is going to be a long one, folks. I guess the big delay in getting this out resulted in a backlog of all the things I wanted to cover. The discussion veers into tools and samples some simple code for dissecting unfamiliar PDUs. There’s more to the “protocol tool” category than just dissecting, of course. But it’s usually the first step and this post will try to focus mostly on it.

The protocol exploration and tools journey often starts out something like this:

1. Got an unknown or little known protocol we’d like to learn more about and maybe mess with.

2. Think: “Hey, self… Wireshark has lots of decodes. Let’s check it out.” Then one of two things often happens:

  • Turns out it is a protocol wireshark DOES decode. “Nice! But what if we want to actively or passively mess with it? Wireshark is a sniffer…”

or…

  • Turns out this is a protocol wireshark doesn’t know about. But it does seem to have a framework for building new ones? Shouldn’t be too hard to just plug in a new decode… should it?

Let me be clear before going any further with this: Wireshark is fantastic! I use it ALL the time. I don’t have to tell anyone who’s ever used it how long and impressive its list of features and dissectors is. It’s gotten a great reputation and it’s well deserved.
But I’ve not had much luck using Wireshark’s framework to add new dissectors or make use of existing ones in other programs. The dissectors Wireshark supports are written in C and compiled in at build time [** see note]. Sure, it uses pcap and there are numerous pcap-based tools for packet mangling/manipulation. That might warrant the effort, but Wireshark is first and foremost a sniffer, and we need to do a lot more than sniff protocols. Its dissectors aren’t yet as easy as one might like to make available in other programs. Every time I’ve looked into it, I end up deciding to just write my own tools from scratch. The reasons will probably become evident as I go on.
** Note: this may be in the process of changing. The Wireshark folks are beginning to implement a Lua interpreter right into the sniffer. Lua is an embeddable scripting language. It’s cool stuff, but so far it’s still pretty new in Wireshark and it’s still just intended as a tool for developing C dissectors.

But anyway… back to my numbers:

3. What’s needed is a more flexible framework. “This is not the wheel I am looking for. Reinvent!!!”
Ok so bear with me. I’m going to jump ahead to the tools and talk some about the techniques I’ve used to attack the problem above in the real world. I’ll digress a bit and yammer seemingly pointlessly about a few iSCSI details too.

Yes, we’ll need a protocol to talk about. For simplicity (… ok well, “reference” anyway) we’re going to stick with iSCSI from the first installment. Wireshark actually dissects iSCSI quite completely. This may defeat the point of the discussion about exploring unknown protocols for some, but for now it’s still a good thing since we can actually compare what Wireshark can do with what we build with our own toolchain.

Let’s start with a simple iSCSI PDU hexdump for reference:

[Image: iSCSI PDU sample hexdump]
Our initial goal was to figure out a bit about the format. I talked about that in the last post, “Exploring Protocols 1”. Now we’ll try to dissect iSCSI PDUs in a manner similar to Wireshark.

In the not too distant past, I would have probably used C/C++ for this. This is mostly because C/C++ has ‘structs’ and pointers. The ‘struct’ lets you describe binary data to your program by defining fields as elements of a structure. Using the structure as the type for a pointer variable lets you apply the structure to a buffer in memory like an overlay. This is pretty obvious stuff to anybody who does any C coding at all, but not everybody does. This simple feature is really powerful and efficient. I still get a few goosebumps when my structures “lay over” properly.

Here is an example of a trivial iSCSI_dissector in C. This sample program is not too robust, but serves as a good starting point. I used a structure borrowed and adapted from linux-iscsi-3.6.3. You should be able to compile this program with “gcc -o iscsi_dissect iscsi_dissect.c” on any *ix and maybe Windows. First let’s just look at the structure defined for the header:
[Image: iSCSI C structure]
The structure for the header defines the fields and sizes based on what the iSCSI RFC calls for. The size of the datatype (uint*_t) defines the width of the field.

There are some funky field alignment issues that the linux-iscsi header structure doesn’t try to fully address in the BHS structure definition, though.

This is just nitpicking (bitpicking?) and may distract a little from protocol dissection, but let’s take a quick look again at the first four bytes from the RFC 3720 syntax description. This time, the structure elements from linux-iscsi are lined up with each field from the RFC description.

[Image: BHS_with_vars.jpg, the first four BHS bytes from RFC 3720 with the linux-iscsi structure elements lined up against each field]

Neither the ‘opcode’ nor ‘flags’ elements in the structure are supposed to take up an entire byte in the BHS if you read the RFC “literally”.

The Basic Header Segment as defined in the RFC has a null bit, followed by an ‘I’ bit, a 6-bit opcode, and a single ‘F’ (final) bit followed by 7 bits allocated as “opcode specific”, or as linux-iscsi called it, “reserved”.

10.2.1.1  I

For request PDUs, the I bit set to 1 is an immediate delivery
marker.

10.2.1.2.  Opcode

The Opcode indicates the type of iSCSI PDU the header
encapsulates.
...

So the first two bits should always be “0 0” or “0 1”. Why the first bit is not used is unclear, but in observation of iSCSI it seems like it’s always set to ‘0’.

So what about that 6-bit opcode after ‘I’? Well, if you look closely in the spec, you won’t find any opcode higher than the number 63 (or 0x3f in hex). Why? Because at least 7 bits are needed to store the number 64 (or 0x40 hex) and higher. So there can never be more than 64 types of opcode (0 through 63) represented in the “Opcode” field alone. Like so:

63 is "0 0 1 1 1 1 1 1" in bits
64 is "0 1 0 0 0 0 0 0"

There’s another similar (but different) situation right after this one:

10.2.1.3.  Final (F) bit

When set to 1 it indicates the final (or only) PDU of a
sequence.

10.2.1.4.  Opcode-specific Fields

These fields have different meanings for different opcode
types.

The basic header segment is just a template for all the types of header segments. They’re all defined in the RFC too. If you read up on them, it turns out that in several cases the remaining 7 bits in the ‘[F][rsvd1]’ byte are used for other flags. In some cases it’s really “reserved”. In other cases it’s used for codes that are significant to the specific segment type. Sometimes the meaning of the ‘F’ bit even gets changed.

It’s a common practice not to define every single bit in a field separately and just combine them in one byte. When using the structure, you’d just use bitwise operations against various values to set, check and clear individual bits.
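
To make the bit layout concrete, here is a small Ruby sketch of those bitwise operations applied to the first two BHS bytes. The method names are mine; the masks follow the layout described above (one unused bit, the I bit, a 6-bit opcode, then the F bit and 7 opcode-specific bits).

```ruby
# Pull the individual fields out of the first two bytes of the BHS.
def split_bhs_flags(byte0, byte1)
  {
    i_bit:           (byte0 >> 6) & 0x01,  # second-highest bit of byte 0
    opcode:          byte0 & 0x3f,         # low 6 bits of byte 0
    f_bit:           (byte1 >> 7) & 0x01,  # highest bit of byte 1
    opcode_specific: byte1 & 0x7f          # low 7 bits of byte 1
  }
end

# Set the opcode without disturbing the top two bits. Masking the new value
# to 6 bits is what keeps an oversized opcode from clobbering the I bit.
def set_opcode(byte0, opcode)
  (byte0 & 0xc0) | (opcode & 0x3f)
end
```

For the sample PDU, `split_bhs_flags(0x24, 0x80)` yields an I bit of 0, opcode 36 and an F bit of 1.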

Protocols often try to squeeze a use out of every single bit. This can give rise to occasional situations where a weird alignment issue like the ‘rsvd1’ or ‘opcode’ ones can result in unexpected behaviors. Depending on the implementation and what assumptions the developer makes, assigning values to the fields next to ‘I’ or ‘F’ might inadvertently set or clear those bits. This type of thing might have dangerous side effects.

Cue haunted house organ music...

Ok… I don’t know of and can’t think of any even remotely dangerous iSCSI opcode or rsvd1 side effects. I and F seem pretty benign to me just from reading the RFC and there’s probably no case short of a grotesquely bad implementation where they’d get clobbered by their neighboring values. The spooky music is… um… just to get in the Halloween spirit. Yea! Happy Halloween everybody!

So, here’s what the output from iscsi_dissect.c looks like (cleaned up slightly for display):

Dissecting iSCSI frame: (Actual PDU Length = 136 bytes)
Format is big endian so the hex order matches wire.
Opcode:        = 36 : (0x24)
Flags:         = 128 : (0x80)
Op-Specific1   = 0 : (0x00)
Op-Specific2   = 0 : (0x00)
TotalAHSLen:   = 0 : (0x00)
DataSegmentLen = 85 : (0x000055)
LUN:      hex: = 0000000000000000
InitTaskTag:   = 1 : (0x00000001)
Opcode Specific Fields in hex:
ff ff ff ff 00 00 00 02 00 00 00 02 00 00 00 03 00
00 00 00 00 00 00 00 00 00 00 00

Data segment strings: (segment length: 88 bytes):
TargetName=iqn.2009-08.com.splatomatic:storage.lvm\x00
TargetAddress=192.168.1.10:3260,1\x00
\x00
\x00
\x00

The struct -> buffer overlay used in the C program is simple and kind-of elegant. Structures are also nice since they can serve as a short-hand for describing the format of messages we’re trying to figure out.

This version also resembles the way most actual iSCSI implementations probably work. For grey/black-box security testing, this can be an advantage. You may gain insights that will help you find where implementation flaws lurk in the real product. Using the same language as the actual implementation can arguably let you get closer to the same thought processes the developers used. If there was a point to that random ranting about partial use of bit-fields within bytes, that was it. The rant should also serve as a case study in following useless random rabbit-holes too early in the exploration phase. Which is another good lesson on exploring protocols and would make a good blog-post by itself. I’m sure most people have already dropped out of this overly long post by now. Too bad for them, they’ll totally miss out on the best part.

Ok… for some, the problem with C is that, well, it can be too rigid. I personally think that higher level interpreted languages are better for doing exploratory work like this. As you’re examining unfamiliar and possibly undocumented protocols or formats, you’re constantly refining your idea of what your structures should look like and what you want to do with them. You don’t want to have to stop and recompile for every change. You *really* don’t want to find yourself having to redesign your tools several times over.

One of the frustrating things about writing quick/dirty tools from scratch for fault testing or fuzzing is first getting rid of faults in your own testing tools. Usually the quicker and dirtier your C is, the harder it becomes to work with. There are lots of programmers out there for whom this may not be a problem. They probably stopped reading a while ago too. But even they end up relying on abstraction to some extent.

My experience has been that using higher level languages gives you more freedom to experiment and you get a lot of relatively stable free abstraction already without having to build it yourself. To put it bluntly, you can get away with more, for longer, usually with less code. On the other hand, all that abstraction can get in your way sometimes when you need to get closer to the wire, but this isn’t as big of a deal as people make it out to be.

So let’s see how we could have written a similar dissector/decoder in an interpreted language. I’m going to use Ruby. Ruby is nice. It has some particularly useful qualities for protocol exploration (besides just being “nice”):

  • It’s an interpreted scripting language. We can make modifications and tweaks on the fly much more easily. No recompiling necessary.
  • Ruby code is very portable. Very few problems getting your code to work on different platforms.
  • It’s got built-in large numbers, a string data-type that doubles as a raw buffer, and lots of other free and generally well designed classes and methods for doing all sorts of commonly useful things. All in all this makes for fewer things to screw up as you’re cobbling something together quickly (as I tend to need to).
  • Modular design is, I think, actually harder *not* to do with Ruby. Keeping things modular when developing exploratory tools will mean that the exploration phase actually yields highly usable code that we can adapt very easily down the road.
  • Did I mention Ruby is nice?

One thing Ruby doesn’t really do without add-ons is ‘structures’. The way Ruby uses references to objects is similar to pointers. Actually, Ruby’s references definitely hearken back to C/C++, the language it’s written in. But the similarities aren’t completely linear. It’s also a moot point because what really helps protocol design is ‘structs’.
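
For comparison, the closest thing stock Ruby offers is `String#unpack`. The sketch below uses it to pull apart the 48-byte Basic Header Segment of the sample PDU. It is a stand-in of my own rather than part of either dissector in this post; the method names are made up, and it assumes no header or data digests are in use.

```ruby
# Dissect the 48-byte iSCSI Basic Header Segment with plain String#unpack.
def dissect_bhs(pdu)
  b0, b1, op_specific, ahs_len, dsl, lun, itt, rest =
    pdu.unpack("C C a2 C a3 a8 N a28")
  {
    i_bit:               (b0 >> 6) & 1,
    opcode:              b0 & 0x3f,
    f_bit:               (b1 >> 7) & 1,
    opcode_specific_1_3: ((b1 & 0x7f).chr + op_specific).unpack1("H*"),
    total_ahs_length:    ahs_len,                        # in 4-byte words
    data_segment_length: ("\0".b + dsl).unpack1("N"),    # 24-bit big endian
    lun:                 lun.unpack1("H*"),
    initiator_task_tag:  itt,
    opcode_specific:     rest.unpack1("H*")
  }
end

# The data segment follows the BHS (and any AHS) and is padded to a multiple
# of four bytes on the wire; the length field excludes the padding.
def data_segment_strings(pdu, bhs)
  start = 48 + bhs[:total_ahs_length] * 4
  pdu.byteslice(start, bhs[:data_segment_length]).split("\0")
end

def padded_length(len)
  (len + 3) & ~3
end
```

That padding rule is why a DataSegmentLength of 85 still occupies 88 bytes on the wire. What `unpack` does not give you is field names, descriptions or display formats attached to the layout itself.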

“But somebody must have done something about this by now!”, you say? Actually several have. One such thing is called BitStruct. I’ve been using it a lot. Not only does BitStruct let you map out binary structures in a way similar to C’s ‘struct’, but it goes further. It is really handy for keeping track of and getting at useful field-related information you’re likely to want. Things like descriptions, sizes, offsets, and even display format are all available as parts of a hash and through free methods. Extending the hash-based design for, say, “annotations” is as easy as tying new elements to a field’s hash. BitStruct also has its own free (if somewhat rudimentary) ‘inspect’ and ‘inspect_detailed’ methods for dissecting your structures in human readable format. There’s a ‘describe’ class method that returns a human-readable description of the format itself. BitStruct has some alignment quirks of its own, but like C’s structs you can work around them.

Take a look at this code. It’s a lot more descriptive than the C struct version. The descriptions are actually part of the code, not comments. The formats for fields are specified right in the structure too!

iSCSI_bitstruct.jpg

Unlike the C version, the rest of the code is really short. So short I’ll paste it all too:

iSCSI_bitstruct_rest.jpg

So here is the iSCSI_dissector in Ruby. Getting BitStruct is a prerequisite, of course. Here’s how we run it and what our output looks like:

$ ruby ./iscsi_dissect.rb

-- Header Dissection --
Type: ISCSI_bhs:
Decode:
I bit                = 0
Opcode               = 36
F bit                = 1
Opcode-specific 1    = 0x00
Opcode-specific 2    = 0x00
Opcode-specific 3    = 0x00
TotalAHSLength       = 0
DataSegmentLength    = 85
LUN or Op-specific   = 0x0000000000000000
Initiator Task Tag   = 1
Opcode-specific 4-12 = "\377\377\377\377\000\000\000\002\000\000\000\002\000\000\000\003\000\000\000\000\000\000\000\000\000\000\000\000"
Data Segment Body    = "TargetName=iqn.2009-08.com.splatomatic:storage.lvm\000TargetAddress=192.168.1.10:3260,1\000\000\000\000"

-- Data Segment Body --
TargetName=iqn.2009-08.com.splatomatic:storage.lvmTargetAddress=192.168.1.10:3260,1

In fairness to C, I obviously didn’t do as much work on displaying the data segment, but could have. I let ‘inspect’ take care of it to illustrate what it does (and doesn’t do).
It’s interesting to note that the dissector framework in Wireshark uses a similar notion of “adding fields”, not unlike BitStruct’s. The sniffer needs easy and generalized access to a bunch of packet and protocol related meta-information used for displaying lots of different protocols, filtering, and so on. Abstracting fields makes a lot of sense for a sniffer, and most of the reasons why carry over to protocol reversing and prototyping.
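For readers who don’t have BitStruct handy, roughly the same header breakdown can be sketched with nothing but Ruby’s String#unpack. This is not the BitStruct code from the post: parse_bhs and sample_pdu are made-up names, the byte layout is the BHS layout from RFC 3720, and the sample PDU is rebuilt from the field values in the output above.

```ruby
# Stdlib-only sketch of a BHS dissector (hypothetical helper, not BitStruct).
def parse_bhs(pdu)
  raise ArgumentError, "a BHS is 48 bytes" if pdu.bytesize < 48

  # Bytes 0-1 carry the flag bits and opcode, bytes 2-3 are skipped,
  # byte 4 is TotalAHSLength, bytes 5-7 are DataSegmentLength.
  b0, b1, ahs_words, d0, d1, d2 = pdu.unpack("C2 x2 C4")
  data_len = (d0 << 16) | (d1 << 8) | d2   # 24-bit big-endian length
  data_off = 48 + ahs_words * 4            # TotalAHSLength counts 4-byte words

  {
    immediate: (b0 & 0x40) != 0,           # I bit
    opcode:    b0 & 0x3f,
    final:     (b1 & 0x80) != 0,           # F bit
    ahs_len:   ahs_words,
    data_len:  data_len,
    lun:       pdu.byteslice(8, 8).unpack("H*").first,
    task_tag:  pdu.byteslice(16, 4).unpack("N").first,
    data:      pdu.byteslice(data_off, data_len).to_s
  }
end

# Rebuild the PDU from the dissection shown above.
text = "TargetName=iqn.2009-08.com.splatomatic:storage.lvm\0" \
       "TargetAddress=192.168.1.10:3260,1\0"
sample_pdu = [0x24, 0x80, 0, 0, 0, 0, 0, text.bytesize].pack("C8") +
             ("\0" * 8) + [1].pack("N") +
             [0xffffffff, 2, 2, 3].pack("N4") + ("\0" * 12) +
             text + ("\0" * 3)             # this sample pads 85 bytes out to 88

bhs = parse_bhs(sample_pdu)
puts "Opcode = #{bhs[:opcode]}, F bit = #{bhs[:final] ? 1 : 0}, " \
     "DataSegmentLength = #{bhs[:data_len]}"
puts bhs[:data].split("\0")
```

What this version gives up compared to BitStruct is exactly the field metadata: there are no descriptions, offsets or display formats to ask for later, just a hash.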

So there you have it. In general, I tend to use Ruby over C for this stuff. It used to be Perl. It’s not because I think I have figured out all the typical implementation gotchas or that I don’t have anything to gain by “getting closer to the wire/implementation” either. It’s more because I’m lazy and/or slammed most of the time. It’s also been fun learning and using Ruby more.

If you prefer C/C++, then use it instead.
Python? great!
Perl? go for it!
LUA? Why not! We all may be doing it soon.
.NET? that’s swell!
Java? … really?
KSH? yea! That might even win you a beer!
WSH? um… wow! ok (but definitely no beer).

Whatever… don’t listen to me or anybody else, just use what you know and/or like and/or want to learn more of.

I guess if there’s been a point to the C versus Ruby thing, it’s that the real key is the end result. Language wars are for people with more free time on their hands than I have. If a language feels like it slows you down, then it’s probably not the right one. I say “get it done” first. If it makes you feel better, just call your early forays “prototyping”. You can always rewrite it in something else once you know how the protocol works. I do sometimes. To an extent, I sometimes think of “C vs. Ruby or Perl” that way, but in practice, I tend to use the four letter languages most of the time.

Anyway, back to the general topic of protocol exploration, and in summary:

For this and the last installment I’ve been talking about iSCSI, a protocol with pretty good RFC documentation and some open source code guiding us along the process of understanding and dissecting PDUs. I’ll be talking about tackling lesser understood and undocumented protocols more in future posts. If iSCSI had been totally undocumented we’d have to figure a lot out on our own. Frankly, iSCSI could take a while to understand if it weren’t so well documented. It’s a pretty complicated protocol as far as they go. I’ve only scratched the surface by talking about the BHS.

I’ve gotten some kind words and gentle nudges to get me back to writing this from one or two of you and I want to say again “Thanks for the feedback!”. It should not have taken so long to get the 2nd installment out.


Exploring Protocols - Part 1

Eric Monti | June 20th, 2007 | Filed Under: Development, Reversing

In the process of doing software security analysis, it is pretty common to encounter unknown network protocols or file formats that are part of the attack surface you’re investigating.
Not too long ago, we wrote a post entitled Reversing a ZLib-obfuscated? Network Protocol where we talked about reversing an undocumented protocol to look for security weaknesses. We got several good questions about some of our deductions about the protocol as we picked it apart. I’d like to take the opportunity to talk more about protocol reversing in general and hopefully help explain how that deduction process works while getting some broader coverage on the subject in.

This will be the first of at least 2 blog posts. I’m going to start by discussing building blocks and see where that takes us. In the early phases of talking about this process, I’m not making a distinction between whether a protocol is “unknown” because of lack of documentation or because it’s simply “unknown to you/me” because we’re unfamiliar with it. Of course an undocumented protocol is going to be trickier to reverse. If there’s a point to these initial posts, it’s that working with documented protocols helps us understand the undocumented ones.

To illustrate some basic protocol dissection ideas, I’m going to talk about iSCSI. I mostly picked iSCSI since I happen to be working with it at the moment and it makes a pretty good case study.

In this post we’ll:
1. Talk a little bit about what iSCSI is and what it’s for.
2. Use Wireshark to find an iSCSI PDU and isolate it.
3. Compare the raw PDU to the specification.
4. Talk a bit about how this all relates to protocol reversing.

In a nutshell iSCSI is:

… SCSI over IP. It’s designed as a low cost solution for network attached storage. A storage server (say a NAS appliance) exports storage as “targets” on any TCP/IP network to which clients (aka “initiators”) connect. Once attached by connecting and logging on, the initiator’s OS sees the target as a hard drive and treats it as a block device. Filesystem drivers ride on top of the device as they would any other SCSI device. Besides file access, an initiator can arbitrarily partition and format the target using its allocated space.

Sounds a bit crazy from a security perspective, right? Well, just bear in mind that iSCSI is not intended as a replacement for CIFS or NFS at all. iSCSI is first and foremost designed as an alternative to more expensive Fibre Channel storage solutions by using cheaper gig-ethernet and possibly leveraging a company’s existing network infrastructure. The iSCSI spec is also apparently designed to be used over other transports besides TCP/IP.

We’re interested in what iSCSI looks like on the wire. This is not undocumented or new territory. Wireshark has iSCSI decoding capabilities way above and beyond the simple dissection tools we’re going to get into for iSCSI. We’re not going to use those decodes much for this discussion, though. Building our own tools gives us more intimate knowledge than relying on Wireshark will. We also want to have some building blocks for doing things later like fault injection if our exploration leads us that way.

iSCSI’s a good case study for protocol exploration since it’s not exactly a “common” network protocol, but has pretty decent documentation and specifications available in RFCs. Picking it apart with some guidance helps illustrate some common network protocol concepts, and we can double-check things against the actual specification to make sure we’re getting them right.

Here’s a hexdump of an isolated iSCSI PDU as it appears on the wire:

pdu-hd.png
I isolated this using Wireshark and saved it as a file to work with. iSCSI uses TCP/3260 as its transport. The pcap filter for this is “tcp port 3260”. Here’s how I did that:

Isolating a single TCP Data Segment In Wireshark

Now that we’ve isolated a sample, the next step is making sense out of the raw PDU. If this were an undocumented protocol, this would be the part where we opened it in a hex editor and started trying to separate chunks into boundaries, assisted by good conversion tools. Actually that’s just one way. Probably the most basic one.
This involves a lot of educated guesswork and is not always a straightforward process. We’re still talking about the guesswork, not doing it (yet).
Here’s the basic header syntax of an iSCSI PDU as defined in RFC 3720 (yep, there it is… we could stop now, but where’s the fun in that?):

iscsi-bhs.png

This type of breakout basically represents how we’d like to be able to understand a network protocol. It’s very rare, even at best, that you’ll actually figure out what every field is for in an undocumented protocol. Just getting fields broken up so you can make sense out of most of them is what you’re usually going after initially. As you start to make sense of other things later, the things you may have originally passed over can gain context.

This RFC explains the various fields pretty well and covers much more than just that. There’s more information in there than we are even likely to need. This raises a good point. Before you start “reversing” anything, always make sure it isn’t documented somewhere or implemented in something you can pull apart.
Using the spec to guide us, we’re going to try to understand this header and see what our captured PDU says. We’ll need to write a tool for this.

In the next post, we’ll:
1. Write a C dissector to emulate Wireshark decodes.
2. Write a Ruby dissector to approximate the C version.
3. Discuss some pros and cons of each.
4. Discuss some of the general things we can learn and how they can be applied to reversing truly unknown protocols.


Reversing a “ZLib-Obfuscated?” Network Protocol

Eric Monti | May 21st, 2007 | Filed Under: Bitching About Protocols, Reversing, Uncategorized

We just wrapped up a security assessment on a commercial enterprise server/agent security product. I can’t get too specific here, but we did run into an interesting problem that we thought would be worth a post.
The application we were evaluating had a home-grown network protocol doing some interesting things worth investigating. What we were seeing from our network capture wasn’t too far from this:

00  46414b45 02000000  06060601 5e000000  |FAKE............|
10  dab624ba da73fed5  b9872696 08ea97a5  |..$..s....&.....|
20  2d626160 60c86248  61c86748 65e000b2  |-ba``.bHa.gHe...|
30  bd80ac0c 863c0605  0617b098 3450cc99  |.....<......4P..|
40  c18a2186 21802191  a1042817 c3100294  |..!.!.!...(.....|
50  89610806 92b94015  310c6e0c 990c3940  |.a....@.1.n...9@|
60  961e4332 502c8f21  0dc84f67 0000        |..C2P,.!..Og..|
6e

Just by glancing at the first 16 bytes, you can spot (1) a message signature; (2) some 4-byte little-endian word values, one of which was obviously a length value for the payload; and (3) version number of 1.6.6.6 in the middle.
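Those three observations can be checked mechanically. Here is a minimal sketch that unpacks the 16 header bytes from the dump above; the variable names are ours, and the meaning of the second word is left open here on purpose.

```ruby
# The first 16 bytes of the capture shown above.
header = [%w[46414b45 02000000 06060601 5e000000].join].pack("H*")

# a4 = 4-byte signature, V = 32-bit little-endian word, C4 = four single bytes.
magic, word1, *ver, length = header.unpack("a4 V C4 V")

version = ver.reverse.join(".")   # bytes 06 06 06 01 read back to front

puts "signature = #{magic}"
puts "word 1    = #{word1}"
puts "version   = #{version}"
puts "length    = #{length}"
```

The length checks out against the capture: the dump ends at offset 0x6e (110 bytes), and 110 minus the 16-byte header is 0x5e (94).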

This looked promising, so we decided to pick it apart some more and see where it got us.

Let me just add at this point: General approaches can vary a lot when it comes to reverse engineering. As you’ll see, what we were doing was not strictly just protocol reversing. We had access to server-side binaries, which we were simultaneously disassembling to guide us at several steps. We could have just gone the strict disassembly route, but in my experience combining the two tends to yield much quicker results.

So, away we went. Or rather, got stuck next. Just past the header of the protocol was a chunk of seemingly meaningless binary data. A bit of disassembling told us that it was something compressed with .NET’s DeflateStream. Here was the real payload and it was time to write our first bit of code.

Since we were working with BlackBag (as regular readers will have noticed — Matasano tends to do) our ideal tools would be small focused ones that could run on Unix. Preferably in the middle of a list of several piped commands so we could say things like:

% cat  | _inflate_ | hexdump -C

And if things got interesting, maybe even:

% cat  | _inflate_ | bkb sub  | _deflate_ | bkb blit

We figured we should be able to get the “Inflated” stream using Zlib. So, we set out to put together some Ruby to take a “deflated” standard input and dump “inflated” standard output.

#!/usr/bin/env ruby

require 'zlib'
buf = STDIN.read()

zs = Zlib::Inflate.new
out = zs.inflate buf

STDOUT.write(out)

And… Fire!

% cat msg.raw |bkb shf 16 | inflate.rb|hd
./inflate.rb:7:in `inflate': incorrect header check (Zlib::DataError)
from ./inflate.rb:7

Woops… maybe not so simple. We asked the Google! Turned out .NET’s DeflateStream doesn’t use the usual ZLIB header and footer as defined in RFC 1950.

Side note: Obviously this had already been tackled. Even though we didn’t try the IronPython solution linked above, I’d probably recommend using it or something like it unless you need something really quick and dirty as we did. The obvious question is: why didn’t we? We were sticking with Ruby for other reasons on this session and didn’t really need a “robust” solution just yet.

So we actually read RFC 1950 at this point. Turned out we just needed to tack on the header (and maybe the footer) ourselves.

#!/usr/bin/env ruby
require 'zlib'
header = "\x78\x01"
buf = STDIN.read()
zs = Zlib::Inflate.new
# Add the header first
zs << header
out = zs.inflate buf
STDOUT.write(out)

Um.. Fire?

$ cat msg.raw |bkb shf 16 |./inflate.rb |hd

00  b6a45b7b 499fd59d  c2917411 2f7666a2  |..[{I.....t./vf.|
10  04000000 6a006400  6f006500 08000000  |....j.d.o.e.....|
20  4a006f00 68006e00  20004400 6f006500  |J.o.h.n. .D.o.e.|
30  1b000000 43003a00  5c005000 61007400  |....C.:.\.P.a.t.|
40  68005c00 54006f00  5c005300 6f006d00  |h.\.T.o.\.S.o.m.|
50  65005c00 46006900  6c006500 2e006300  |e.\.F.i.l.e...c.|
60  6f006e00 66006900  6700                    |o.n.f.i.g.|
6a

Much better.
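Reading the dump, the payload after the first 16 bytes looks like a run of strings, each prefixed with a 4-byte little-endian character count and stored as UTF-16LE. That is an inference from the bytes shown, not something confirmed against the product, and read_strings is a name made up for this sketch:

```ruby
# The inflated payload from the dump above, as hex.
payload = [%w[
  b6a45b7b 499fd59d c2917411 2f7666a2
  04000000 6a006400 6f006500 08000000
  4a006f00 68006e00 20004400 6f006500
  1b000000 43003a00 5c005000 61007400
  68005c00 54006f00 5c005300 6f006d00
  65005c00 46006900 6c006500 2e006300
  6f006e00 66006900 6700
].join].pack("H*")

# Walk count-prefixed UTF-16LE strings, skipping the leading 16-byte blob
# (its purpose is not identified here).
def read_strings(payload, offset = 16)
  strings = []
  while offset + 4 <= payload.bytesize
    count = payload.byteslice(offset, 4).unpack("V").first
    offset += 4
    raw = payload.byteslice(offset, count * 2)   # two bytes per character
    break if raw.nil? || raw.bytesize < count * 2
    strings << raw.force_encoding("UTF-16LE").encode("UTF-8")
    offset += count * 2
  end
  strings
end

puts read_strings(payload)
```

Note that the count is in characters, not bytes: 0x1b is 27, and the path that follows takes up 54 bytes.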

Those who’ve read the RFC or are already familiar with ZLib may notice we didn’t bother with the ADLER32 checksum footer. Our quick/dirty Ruby ZLib implementation didn’t seem to notice when it was missing. Honestly not sure whether this is expected behavior or not, but it suited us just fine. We really just wanted to get back to picking apart the protocol.
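For what it’s worth, there is another way to get the same result without faking a header at all. Ruby’s Zlib binding accepts a negative window-bits value, which tells zlib to treat the data as a raw deflate stream with no RFC 1950 wrapper. This is a sketch of that alternative, not what we used on the engagement; raw_inflate and raw_deflate are illustrative names.

```ruby
require 'zlib'

# Negative window bits select raw deflate: no 2-byte header is expected
# and no Adler-32 footer is checked.
def raw_inflate(data)
  Zlib::Inflate.new(-Zlib::MAX_WBITS).inflate(data)
end

# The matching direction: emit a raw deflate stream with no wrapper to strip.
def raw_deflate(data)
  Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION, -Zlib::MAX_WBITS)
              .deflate(data, Zlib::FINISH)
end

sample    = "j\0d\0o\0e\0" * 8
roundtrip = raw_inflate(raw_deflate(sample))
puts roundtrip == sample ? "round trip ok" : "round trip failed"

# As a pipe filter, the body would be just:
#   STDOUT.write(raw_inflate(STDIN.read))
```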

What was “inflated” might also need to get “deflated” again, so we also whipped up a “deflater”.

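#!/usr/bin/env ruby
require 'zlib'
buf = STDIN.read()
zs = Zlib::Deflate.new()
out =  zs.deflate(buf,Zlib::SYNC_FLUSH)
# Output the deflated chunk without the 2-byte zlib header and the last 4 bytes.
# (With SYNC_FLUSH those 4 bytes are the 00 00 ff ff flush marker; the
# Adler-32 footer is only written when the stream is finished.)
STDOUT.write(out[2,(out.length - 6)])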

Turned out we didn’t need to use the “deflate” script much: between protocol decoding and disassembly, we learned one of the original uncompressed 4-bytes in the protocol’s header was for payload *type*, either *deflated* or *raw*. So, even though we confirmed our deflater worked well enough, we usually just changed the type to *raw* whenever we wanted to send something back to the server.

And in conclusion (which I have to speak in vague terms about to protect the guilty, sorry): now that we could read and compose messages, we learned this protocol was letting the agent do some truly crazy things. Things like passing entire lists of fields to insert/update directly into SQL. Without authentication.

Identifying and decompressing the protocol’s payload was the only hurdle we had to get over to proceed with other attacks. In the end, these culminated in several findings, including trivial database corruption and even injecting malicious data to capture admin privileges through the product’s console. Again… without authentication.

Moral of the story:

I try not to speculate too much about what developers’ intentions are or were when I find something like this. Hindsight is 20/20 and it’s generally a lot easier to break than build. But, I couldn’t help but wonder whether they had intended to use DeflateStream as a cheap form of obfuscation here. It’s just as possible they just wanted to keep the payloads small and didn’t even consider the risks faced by the protocol at all.

Zlib is not encryption (I feel dumb even saying it). Even more so if your protocol is wide open. Authentication would have been tricky no matter what. There were inherent trust boundaries invading way into the agent. That was even more reason for this protocol to use encryption. Though frankly it wouldn’t have solved all this protocol’s problems — crypto is not an argent projectile. There were some deeper design issues lurking here.

But at the very least, it would have raised the bar for reversing. Because the ZLib “hurdle” took us all of about 20 minutes to beat.


MITMing an SSLized Java App

Dave G. | May 1st, 2007 | Filed Under: Bitching About Protocols, Reversing

I was recently working on a Java-based application that communicated exclusively over SSL. This is a good thing for the application, but a bad thing for someone trying to test it. I naively thought that I could edit a couple of files and boom, be on my way. Alas, what follows is what I had to do to get in between and start understanding the application:

My initial take was that I would use two instances of stunnel (I use 3.x because I am old, crusty, and like the simplicity of the 3.x command line interface), with Blackbag’s replug in between so I can view the traffic.

So, here is my simple setup (monitoring a connection to www.amazon.com as an example):

stunnel-replug

All I do is test it with another instance of stunnel and we see that traffic is passing through our rat’s nest of proxying.

Of course, the first wrinkle is that I need to actually test from Windows.
So we load up Parallels and get the application pointing at our tunnels. Easiest way to do this is to just edit the hosts file, located at \WINDOWS\System32\Drivers\etc. Looks just like /etc/hosts, so we just add:

192.168.2.2          alas.we.cant.share.this.com

Which should work as long as IP addresses aren’t hardcoded into the application. Thankfully, they weren’t. Now I just run the app, and everything works.

Except that it doesn’t work at all. Turns out that the JVM does certificate validation. When web browsers encounter certificate problems, they let the user decide if they want to continue to connect. When just about any other kind of application encounters validation problems, it will just fail.

There are two basic validation steps it will perform:

  1. Valid Certificate: Is this signed by someone that we trust (e.g. Verisign)?

  2. Hostname: Does the hostname inside of the certificate match the hostname we are trying to connect to?

This is where I feel like I must have missed something. I can’t find anywhere inside of the Java documentation where I can disable this using JVM configuration files. After poking around for a bit, and using Google, I see two options.

  1. Install a new certificate into the instance of Java I am using
  2. Decompile, edit and recompile (you can turn off certificate validation programmatically)

So, I go with Option #1. I start by generating a new certificate using the stunnel Makefile (you can just use OpenSSL):

make a cert

The most important parameter you are setting is the Common Name. It must match the server that our application thinks that it is going to communicate with.
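If the stunnel Makefile isn’t around, the same kind of certificate can be produced with Ruby’s bundled OpenSSL binding. This is a rough sketch rather than what I actually ran: self_signed_cert is a made-up helper, and the key size, serial number and one-year lifetime are arbitrary choices.

```ruby
require 'openssl'

# Build a self-signed certificate whose Common Name is the host the
# application expects to talk to.
def self_signed_cert(common_name)
  key  = OpenSSL::PKey::RSA.new(2048)
  name = OpenSSL::X509::Name.new([["CN", common_name]])

  cert = OpenSSL::X509::Certificate.new
  cert.version    = 2                        # value 2 means X.509 v3
  cert.serial     = 1
  cert.subject    = name
  cert.issuer     = name                     # self-signed: issuer == subject
  cert.public_key = key.public_key
  cert.not_before = Time.now
  cert.not_after  = Time.now + 365 * 24 * 60 * 60
  cert.sign(key, OpenSSL::Digest.new("SHA256"))

  [key, cert]
end

key, cert = self_signed_cert("alas.we.cant.share.this.com")
puts cert.subject
# key.to_pem and cert.to_pem give the PEM text to hand to the tunnel.
```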

Then I cheat and use InstallCert, a Java app written by Andreas Sterbenz who also beat his head against Java certificate validation issues. The neat thing about InstallCert is that you just give it a URL and it will grab the cert and create a keystore you can use. You just copy over your old cacerts file with the newly created one, and your cert will now be accepted for future use. We just point InstallCert at our stunnel instance:

InstallCert-start

You can safely ignore the SSLHandshake errors. You should really run InstallCert on the machine you intend to test from (in my case the Windows box), but it was faster for me to create screencaps in OS X. Then InstallCert will ask you if you want to accept the cert:

InstallCert-confirm

InstallCert will create a file called jssecacerts in the current working directory. You will want to copy that file over the actual keystore usually located in JAVAPATH/lib/security/cacerts.

Note: Back up the cacerts file before overwriting it.

This is where someone comments, “but all you had to do was edit java.security to say XXX and you could have disabled certificate validation”, and where I wish there was a phrase that meant both Thank You and I Hate You. If it turns out that I am correct, then I have to ask: Why on Earth isn’t this a configuration file option?


Analyzing Mac OS X Applications 101: CrashReporter and Malloc

Dave G. | April 24th, 2007 | Filed Under: Apple, Reversing

Believe it or not, we don’t just talk about OS X security, we actively search for new vulnerabilities in it. I have found a number of vulnerabilities, both local and remote in OSX and OSX Server over the years. This post is the start of a series of posts that explain some of the tricks that we use when assessing applications on Mac OS X.

For the most part, these tips apply to both GUI and command line apps. This isn’t rocket science, but is a good primer for people looking to dive into OS X vulnerability analysis. I am going to use Safari as an example, since it is somewhat topical. It isn’t the best example since enough of it is open source that you can gain a lot more insight via debug builds.

  1. CrashReporter is probably the most useful utility for identifying that an application crashed, and getting some basic insight into what caused the exception. You have probably seen elements of CrashReporter when an application crashes on you. While it can be customized on a per application basis, most applications use the default. Here is an example of a CrashReporter dialog:

CrashReporter Dialog Window

If you click on Report… you will see information on CrashType. This will be your first potential clue into what might have caused the crash, especially if you see an EXC_BAD_ACCESS along with a KERN_INVALID_ADDRESS at an address that you control. In this case, it is not obvious that I control this value, although it does look like it is an ASCII value:

CrashReporter CrashType-Reduced

Another bad sign is if you see EXC_BAD_INSTRUCTION. Under x86 it is less likely you will see this with a straight up vanilla stack overflow, but there are other forms of memory corruption that can elicit this error.

And scrolling down, you will see register state. There is more useful information, including which thread crashed, as well as which function caused the exception.

CrashReporterRegisters-reduced

One neat OS X feature is the ability to change this behavior via the defaults command. By setting the CrashReporter DialogType to “developer”, you will now have the option to attach to the process via GDB by clicking a button.

In a Terminal window, execute:

defaults write com.apple.CrashReporter DialogType developer

Now let’s look at that same crash:

CrashReporterWAttach-Redacted.jpg

Note: This won’t work on basic command line UNIX applications.

If you aren’t using the GUI, you can also monitor files in ~/Library/Logs/CrashReporter or /Library/Logs/CrashReporter. Files will be named .crash.log if CrashReporter can figure out the name of the application that crashed. Otherwise you will see names like ???.crash.log or Exited process.crash.log. Finally, if you are fuzzing the kernel, you can find information on kernel panics in /Library/Logs/panic.log.
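When fuzzing, it helps to check those directories automatically between test cases. A small sketch, assuming the two directories named above and the .crash.log suffix; recent_crash_logs is a hypothetical helper, not an Apple tool.

```ruby
# The two CrashReporter log directories mentioned above.
CRASH_DIRS = [
  File.join(ENV["HOME"].to_s, "Library/Logs/CrashReporter"),
  "/Library/Logs/CrashReporter"
].freeze

# Return crash logs modified after `since`, oldest first.
def recent_crash_logs(since, dirs = CRASH_DIRS)
  dirs.flat_map { |dir| Dir.glob(File.join(dir, "*.crash.log")) }
      .select   { |path| File.mtime(path) > since }
      .sort_by  { |path| File.mtime(path) }
end

# Typical use: note the time, send a test case, then look for new logs.
started = Time.now
# ... deliver the test case to the target here ...
recent_crash_logs(started).each { |path| puts "new crash log: #{path}" }
```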

  2. Malloc Debugging

The OS X malloc implementation contains a lot of useful features for understanding when (and how) memory corruption is occurring. These features are controlled via environment variables. These variables are outlined both inside of the malloc man page and via setting the environment variable “MallocHelp” and executing a command:

MallocHelpNew1.png

Before I go further, if you are interested in understanding more about exploiting Apple’s malloc() implementation, you must read FelineMenace’s paper on the subject (They also discuss the MallocDebug facilities).

The variables I set are:

MallocLogFile By default, the malloc debugging modes write their output to stderr. For GUI apps, setting MallocLogFile makes it much easier to monitor the output. You simply set the environment variable to the name of the file you want output written to.

ThisOldVuln Moment #1: There was a vulnerability associated with this environment variable where a user with an account on the system could overwrite arbitrary files and become root.

MallocGuardEdges Helps identify potential buffer overflows by putting guard pages on either side of large blocks. It won’t detect all issues, but it can help.

MallocScribble Useful for detecting when an application is writing to a region of memory that has already been freed or memory that was never initialized. Both of these conditions can be exploitable security issues.

MallocBadFreeAbort While a simple double free or freeing non malloc()d regions of memory are generally not considered exploitable conditions under OS X, there are more complicated double free() situations that can be exploitable. Besides, in 1996 everyone I knew said that heap overflows were unexploitable.

MallocCheckHeapStart, MallocCheckHeapEach, MallocCheckHeapAbort, MallocCheckHeapSleep

These four variables control when to start checking for heap corruption, how often to check, and what to do when heap corruption is identified. I would generally recommend Abort while fuzzing. When you want to debug, switch to Sleep so that you can attach at the point that corruption was identified.
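For command line targets, the same variables can simply be handed to the child process. A sketch of how that might look from a fuzzing harness; the specific values, the log path and malloc_debug_env itself are assumptions for illustration, so check the malloc man page on your system for what each variable accepts.

```ruby
# Build the malloc debugging environment described above.
# on_corruption is :abort (for fuzzing) or :sleep (to attach a debugger).
def malloc_debug_env(log_file, on_corruption = :abort)
  env = {
    "MallocLogFile"        => log_file,
    "MallocGuardEdges"     => "1",
    "MallocScribble"       => "1",
    "MallocBadFreeAbort"   => "1",
    "MallocCheckHeapStart" => "1",    # assumed: begin checking right away
    "MallocCheckHeapEach"  => "100"   # assumed: recheck every 100 operations
  }
  case on_corruption
  when :abort then env["MallocCheckHeapAbort"] = "1"
  when :sleep then env["MallocCheckHeapSleep"] = "100"
  else raise ArgumentError, "expected :abort or :sleep"
  end
  env
end

fuzz_env = malloc_debug_env("/tmp/malloc.log", :abort)
fuzz_env.each { |name, value| puts "#{name}=#{value}" }

# To launch a command line target with these set (path is a placeholder):
#   system(fuzz_env, "/path/to/target")
```

Kernel#system takes the environment hash as its first argument, so nothing leaks into the harness’s own environment.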

Here is my Info.plist file for Safari:

Saf Plist

As you noticed from the above, something caused Safari to crash. Figuring out what caused the crash isn’t always easy, especially with complex applications like Safari. In this case, we get another potential clue via the malloc debugging output:

debugoutput1.png

Minor Note: The output was edited down.

Less Minor Note: 1.5G is a large number to get passed to vm_allocate (assuming the output here is correct).

This will be a living document (even though I hate that term). I know smarter people read this blog and would love feedback. Future topics will include file system access, deeper dive on debugging facilities and automation.


Refreshing Change Of Pace: Actual Technical Discussions at Nate’s Blog

Thomas Ptacek | April 22nd, 2007 | Filed Under: Reversing, Uncategorized

You came for the Mac soap opera. You left to read about anti-debugging/anti-reversing technology and driver reverse engineering using virtual machines at Nate’s blog. As usual, Nate’s posts are not to be missed. See you there.


Is Win32 A Debugging API? If Not, How Close Is It?

Thomas Ptacek | February 26th, 2007 | Filed Under: Reversing, Uncategorized

Assume a black-box pen-test with a Win32 target that has perfect debugger detection (disregarding how hard “perfect” is to achieve). Arbitrarily, assume no access to the kernel; in fact, no administrator privilege at all. We simply run in processes alongside the target with the same credentials.

How much control do we have over the target?

  • We can get a Win32 handle to the process with OpenProcess.

  • We can read process memory by virtual offset and length with ReadProcessMemory

  • We can enumerate the threads in the target with Toolhelp32

  • We can suspend or resume individual threads with OpenThread, SuspendThread, and ResumeThread

  • We can write process memory by virtual offset and length with WriteProcessMemory

  • We can allocate memory within the target with VirtualAllocEx

  • We can change memory protection with VirtualProtectEx

  • We can enumerate modules and offsets within the target with Toolhelp32.

  • We can map out memory regions with VirtualQueryEx.

  • We can execute code in the context of the process with CreateRemoteThread (and RtlRemoteCall).

How much control would a debugger have given us?

  • We’d be able to suspend and resume threads, which we can do anyways.

  • We’d be able to read and write memory, which we can do anyways.

  • We’d be able to set breakpoints.

  • We’d be able to single-step the program.

  • We’d be able to read register contents.

  • We’d be able to call functions, which we can do anyways.

  • We’d be able to search memory for strings, which we can do anyways.

Without anything more interesting than the MSDN man pages, we come reasonably close to this without invoking the debug interface. In exchange for not showing up in NtQuerySystemInformation or mucking with the UEF, we give up easy-to-use breakpoints, single-stepping, and register access. But:

  • We can trivially get single-use, nonrecoverable breakpoints; just inject INT 3 into the text with WriteProcessMemory.

  • We can potentially get access to registers using NtSetThreadContext.

  • We can hook functions, in all the usual ways.

How close can we get to a fully-functional debugger without Windows knowing about it? I’ve been playing with this for a couple weeks, using Python and ctypes, turning IDLE into a debugger prompt. And it seems like the answer is “extremely close”. More later, but if this is totally obvious, pointers, links, and comments are welcome.


BinNavi Traces IOS and ScreenOS. It’s On, Yo.

Thomas Ptacek | January 24th, 2007 | Filed Under: Reversing, Uncategorized

The new version of Sabre’s BinNavi can single-step, trace, and graphically navigate Cisco IOS and NetScreen ScreenOS. Without source. I don’t have much to add to this story. Tally ho!


Mystery Vulnerability Theater 3000: Part I

Dave G. | December 12th, 2006 | Filed Under: Reversing

Tom was telling me about a really quick way to do risk assessments on memory trespass/register control vulnerabilities. As it happens, the next day I found myself controlling some registers in an application, and figured I would see where it got me.

The basic idea is that when you find a vulnerability and don’t have the time to confirm exploitability via an exploit, you can mock it up in a debugger by setting breakpoints just before the instruction that causes the exception, and resetting the register you control to a valid memory address (I chose the stack).


sitio:~ root# !FOO
FOO=`ps -auxwww | grep XXXXX | grep -v grep | awk '{ print $2 }'` ; gdb -pid $FOO XXXXX
GNU gdb 6.1-20040303 (Apple version gdb-384) (Mon Mar 21 00:05:26 GMT 2005)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "powerpc-apple-darwin"...
Attaching to program: `XXXXX', process 23276.
Reading symbols for shared libraries …………………………………………………………………………………………. done
Reading symbols for shared libraries ……………… done
0x9000ab48 in mach_msg_trap ()
(gdb) cont
Continuing.

We have attached and the process is running as normal. Let’s see what happens when we send some bad input:


Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x62626262
[Switching to process 23276 thread 0x7103]

0x62626262, eh? BBBBadness.


0x0224f864 in SomeRandomFunction ()
(gdb) bt
#0 0x0224f864 in SomeRandomFunction ()
#1 0x02257d2c in TotallyRandomFunction ()
#2 0x022570c0 in ReallyRandomFunction ()
Cannot access memory at address 0x61626364
Cannot access memory at address 0x6162636c

Yah I know, the 0x61626364’s look tempting… just ignore them.
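A quick way to confirm that a faulting address is really your input is to turn it back into bytes. The target here is PowerPC, so big-endian byte order applies; address_to_ascii is just an illustrative helper.

```ruby
# Render a 32-bit address as the four input bytes it would have come from,
# in big-endian order, with non-printable bytes shown as dots.
def address_to_ascii(addr)
  [addr].pack("N").gsub(/[^\x20-\x7e]/n, ".")
end

[0x62626262, 0x61626364, 0x6162636c].each do |addr|
  puts format("0x%08x -> %s", addr, address_to_ascii(addr))
end
```

On a little-endian target the same idea works with pack("V") instead of pack("N").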


(gdb) disas _SomeRandomFunction
Dump of assembler code for function _SomeRandomFunction:
0x0224f84c <_SomeRandomFunction+0>: mflr r0
0x0224f850 <_SomeRandomFunction+4>: stmw r29,-12(r1)
0x0224f854 <_SomeRandomFunction+8>: stw r0,8(r1)
0x0224f858 <_SomeRandomFunction+12>: mr r29,r3
0x0224f85c <_SomeRandomFunction+16>: stwu r1,-80(r1)
0x0224f860 <_SomeRandomFunction+20>: li r3,0
0x0224f864 <_SomeRandomFunction+24>: lwz r4,0(r29)
0x0224f868 <_SomeRandomFunction+28>: bl 0x22547f4
0x0224f86c <_SomeRandomFunction+32>: lwz r0,88(r1)
0x0224f870 <_SomeRandomFunction+36>: addi r1,r1,80
0x0224f874 <_SomeRandomFunction+40>: mr r4,r29
0x0224f878 <_SomeRandomFunction+44>: mtlr r0
0x0224f87c <_SomeRandomFunction+48>: li r3,0
0x0224f880 <_SomeRandomFunction+52>: lmw r29,-12(r1)
0x0224f884 <_SomeRandomFunction+56>: b 0x22547f4
End of assembler dump.
(gdb) quit
The program is running. Quit anyway (and detach it)? (y or n) y
Detaching from program: `XXXXX’, process 23276 thread 0×7103.

Looks like we grabbed control of a register, but it is r29 rather than anything that hands us execution directly. Since we are lazy and want to see if we can quickly confirm exploitability, let’s try something:


sitio:~ root# FOO=`ps -auxwww | grep XXXXX | grep -v grep | awk '{ print $2 }'` ; gdb -pid $FOO XXXXX
GNU gdb 6.1-20040303 (Apple version gdb-384) (Mon Mar 21 00:05:26 GMT 2005)

Attaching to program: `XXXXX', process 23324.

0x9000ab48 in mach_msg_trap ()
(gdb) break *0x0224f860
Breakpoint 1 at 0x224f860
(gdb) cont
Continuing.
[Switching to process 23324 thread 0x8403]

Same as before, only this time we set a breakpoint one instruction before the invalid access and then continue execution. Now we send the same malicious input, and…


Breakpoint 1, 0x0224f860 in SomeRandomFunction ()
(gdb) x/2i 0x0224f860
0x224f860 <_SomeRandomFunction+20>: li r3,0
0x224f864 <_SomeRandomFunction+24>: lwz r4,0(r29)

(gdb) set $r29=0xbffffd30
(gdb) cont
Continuing.

Now that we have hit our breakpoint, we reset r29 to a valid memory address and continue. Let’s see what happens next…


Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x6262626a
0x02257f6c in SomeWhereElseRandom ()
(gdb) bt
#0 0x02257f6c in SomeWhereElseRandom ()
#1 0x02257d44 in TotallyRandomFunction ()
#2 0x022570c0 in ReallyRandomFunction ()
Cannot access memory at address 0x61626364
Cannot access memory at address 0x6162636c
(gdb)

Oh look, we’ve moved past it. Let’s check out our execution window:


XXXXX(23324,0x2842a00) malloc: *** Deallocation of a pointer
not malloced: 0x64676f6c; This could be a double free(), or
free() called with the middle of an allocated block; Try setting
environment variable MallocHelp to see tools to help debug
XXXXX(23324,0x2842a00) malloc: *** Deallocation of a pointer
not malloced: 0xbffffd30; This could be a double free(), or
free() called with the middle of an allocated block; Try setting
environment variable MallocHelp to see tools to help debug

Alas, free() on OS X validates that the pointer it is handed refers to currently malloc()ed memory. This doesn’t necessarily mean that what I have found isn’t exploitable. But I am not off to a good start.

What I like about this technique is that you can quickly see whether a crash leads somewhere exploitable.

What I dislike about this technique is that you can’t use it to confirm something isn’t exploitable. For example, even assuming this had been a double free that I could exploit, the application in question runs my input through isalpha(). I wouldn’t have known that by relying solely on this trick.
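To make the isalpha() problem concrete, here is a small sketch that checks whether a 32-bit value could be delivered through such a filter. It is a guess at the constraint rather than a description of the real application: it assumes every byte of the input must pass isalpha() in the plain ASCII sense, and that addresses are big-endian as on PowerPC. The function name `survives_isalpha` is hypothetical.

```python
import struct


def survives_isalpha(addr):
    """True if all four bytes of addr are ASCII letters.

    Assumptions: big-endian layout (PowerPC) and an ASCII-only
    isalpha(), which is what bytes.isalpha() implements.
    """
    return struct.pack(">I", addr).isalpha()


if __name__ == "__main__":
    for addr in (0x62626262, 0x61626364, 0xBFFFFD30):
        verdict = "passes" if survives_isalpha(addr) else "blocked"
        print(f"0x{addr:08x}: {verdict}")
```

Under those assumptions the stack address I poked into r29 by hand could never have arrived through real input, which is exactly the kind of detail the debugger mock-up hides.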

Live and learn…
