Race To Zero: It’s Not A Contest, It’s A Protest
Race To Zero is an event that pits hacker-types against an array of AV products. Unofficially hosted at DEFCON this year, it has already sparked the ire of the AV community. This makes sense as we all know that there is little they can do to stop researchers from writing malware that will be undetectable (until their next update). From their perspective, it is a waste of time. And that is somewhat true. Especially their time.
This type of event, along with the Consumer Reports test of 2006, runs the risk of wasting the AV community’s time. The Consumer Reports test, if we all recall, had no negative impact on society (or even on AV vendors). Even so, I acknowledge it is a pain in the ass for them: a combination of bad press, plus a bunch of really crappy malware samples that have to be documented, analyzed, detected and removed even though they will most likely never, ever impact a person outside of a lab environment.
The idea that the AV companies are getting free research is pretty ludicrous. All that happens is that they will have to analyze these modified viruses to figure out how to detect them. It is just another day at the office.
Which gets to the heart of the matter:
This contest isn’t a contest. This contest is a protest. It is a protest against the fact that there is simply not enough innovation in the anti-malware space. The problem is getting worse and all of the solutions appear to come from the same tunnel-vision line of thought. The vendors that do this have successful businesses that run just fine. New malware will get fixed with the same old solution.
The take-away isn’t going to be research that will help the AV industry to see emerging techniques. It will be that there has to be another way. Events like this should inspire someone fresh to come in and build a better mousetrap, and build the next MFE or SYMC.
BlackBag 0.9.1 - New link and minor fixes
It seems our old link to Black Bag on here went bad some time ago. We’ve been getting lots of requests for a new link.
P.S. Thanks to Marcin, for pointing us at sockpuppet. Nobody at Matasano could seem to remember where we’d seen it last!
You may notice the minor version number bumped. In the process of digging up a working tarball, I took the opportunity to make two very trivial tweaks:
- Fixed a small bug in tsec.c that was causing “make” to fail.
- Added offsets to deezee’s output (culled from the silly little patch I mentioned in my last post)
Retsaot is Toaster, Reversed: Quick ‘n Dirty Firmware Reversing
I recently worked on a project that involved embedded systems and reverse engineering. This sort of territory can be a little hairy the first few times out. I ran into some interesting challenges and discoveries along the way which I thought might be worth writing a little bit about. I can’t tell you what the target was. But, it was important. And, we beat the crap out of it. So instead, I’ll tell you what I wish it was: a networked 4-slot toaster.
Now, to make things interesting: early on, I’d discovered a vulnerability in the toaster that allowed any attacker to load their own firmware on the device. Ouch! My toast! My beautiful toast!
In order to drive home the risk (mostly to the vendor) of the firmware loading vulnerability, I was asked by my customer (also the vendor’s customer) to demonstrate the attack by actually loading malicious firmware onto the device and getting it to run.
Mind you, the request to prove this is actually pretty sane. I had little knowledge of the boot loader, or even of the firmware image format. I couldn’t say for sure that there wasn’t a code-signing feature, which would prevent the toaster from loading any image that wasn’t cryptographically signed by the vendor. That would have rendered the firmware loading attack impotent. To make things worse, the vendor was being pretty light on details. Can’t say I blame them.
So… I was tasked with demonstrating that the vendor’s firmware:
- Could be reverse engineered
- Modified
- Loaded back onto the device
- By an attacker
Think “embedded rootkit”. Embedded rootkits are sort of a holy grail: it’s easy to load a rootkit on a PC, but tricky to get them onto a wireless router or switch (or toaster). The reward for doing it, though, is that nobody ever thinks to check their router or toaster for rootkits. Initially, my bar was just going to be to demonstrate I’d actually changed some aspect of the program with my own firmware patch and move on to more penetration testing.
Some embedded projects are easier than others. Sometimes you get lucky. You find a debug shell, a file-system, or a serial console. A lot of times, devices that look like black boxes have JTAG ports, to which you can attach a debugger. Devices built in the last 10 years tend to have GDB stubs, so you can target them with cross-debuggers. Some even have firewire, through which you can DMA in and out of system memory.
This wasn’t any of those. Without reversing the firmware, I had no way to execute code on the device. For the purposes of penetration testing, I really couldn’t even see what was actually happening on my target. I hate that. All I had to work with was the firmware image file itself, offline, and some observable external behaviors of the device.
So… if I was going to make any of my objectives happen, I was basically starting purely from static analysis. Back in 2005, Tom wrote about a similar set of steps he took given a similar starting point. Without realizing it this time out, I actually followed them almost to the letter. There were a few points where our paths diverged slightly because of differences in our targets and approaches.
This is my story. Details have been fuzzed to protect the guilty toaster.
1. In Which I Get The Image
I start out with a couple megs worth of firmware image file. This file is sent to the device when you do a firmware upgrade the official way. You can usually find firmware images on the vendor’s download site, or on the CDs that came in the box.
2. In Which I Scan The Image
I feed it to deezee. Deezee is part of Matasano’s homegrown “Black Bag” toolkit. What deezee does is search through a binary file for compressed data by looking for zlib signatures, and then extracting anything it finds. It’s crazy how often this seems to work on unknown binary files. As it turns out, zlib is the industry standard compression format, even for toasters.
Sure enough, deezee finds not one, but 3 distinct blobs of compressed data packed in the file. Hmm… really… deezee should tell me where it finds these things, I think. Let’s fix that now. Much better.
Edit: Here’s the link to blackbag with patch applied.
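For readers without Black Bag handy, the scanning idea can be approximated in a few lines of Ruby. This is only a sketch of the general technique, not deezee’s actual implementation: the method name find_zlib_blobs and the shape of the returned hashes are made up for illustration. It walks the file, looks for bytes that could be a zlib header, and keeps any candidate that inflates to a complete stream.

```ruby
require "zlib"

# Scan a binary blob for zlib streams and report where each one starts.
# A zlib stream begins with a 2-byte header: the low nibble of the first
# byte is 8 (deflate), and the two bytes, read as a big-endian 16-bit
# number, are a multiple of 31.
# (find_zlib_blobs is a hypothetical helper, not part of Black Bag.)
def find_zlib_blobs(data)
  hits = []
  offset = 0
  while offset < data.bytesize - 1
    cmf = data.getbyte(offset)
    flg = data.getbyte(offset + 1)
    if (cmf & 0x0f) == 8 && (cmf >> 4) <= 7 && ((cmf << 8) | flg) % 31 == 0
      inflater = Zlib::Inflate.new
      begin
        out = inflater.inflate(data.byteslice(offset..-1))
        if inflater.finished?
          # total_in is the number of compressed bytes the stream consumed.
          hits << { offset: offset, compressed_size: inflater.total_in, data: out }
          offset += inflater.total_in
          next
        end
      rescue Zlib::Error
        # The header looked plausible but the data was not a real stream.
      ensure
        inflater.close
      end
    end
    offset += 1
  end
  hits
end
```

Calling find_zlib_blobs(File.binread("firmware.bin")) (the filename is a placeholder) would return one hash per blob, including the offset that makes the hex editor work in the next step much easier.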
3. In Which I Read
With addresses for compressed chunks in hand, I open up the file in my hex editor.
Deezee found the first compressed segment around 384 bytes from the beginning of the file. Here’s where technique comes into the picture.
Deezee found three “hits” in the image. I assume they aren’t just sprinkled haphazardly throughout the image; there should be some kind of header on them. I want to know what that header looks like. Maybe it’s used for things besides compressed blobs? The way I’m going to do that is, I’m going to compare the bytes surrounding the compressed blobs for all three hits.
So, comparing the preceding 384 bytes of that and the other two hits, I see several similarities:
- A 4-byte signature. I can tell because it’s always the same four bytes, in the same place relative to the blob.
- An ASCIIZ string, apparently describing the firmware version, padded to 128 bytes. You see “padded strings” all the time in binaries; what they are is a C-struct with a “char version[128]”.
- A 16-byte ASCIIZ string of numbers describing just the version, again padded with NULs.
- 4 bytes which, when unpacked as an integer, happen to match the size of the compressed chunk. (If your hex editor won’t tell you the big and little endian 32 bit value of four highlighted bytes, get a new hex editor). This is the trick that cracks most image formats: you look for 16 or 32 bit fields that look like lengths, and try to reconcile them to the file and its features.
- 4 bytes which… hmm… could be CRC32 checksum? This is a bit of a shot in the dark, but it’s easy enough to check:
$ cat app.bin | ruby -e \
  'require "zlib"; puts Zlib.crc32( STDIN.read ).to_s(16)'
a76ea2ad
And… they match! I’ll need to keep this checksum in mind when I try modifying the file. Bodes well for no code signing!
Why did I try a CRC32 checksum first? Well… first off, I assumed the field was 32 bits long. And CRCs (Cyclic Redundancy Checks) are used a lot in file formats that encapsulate other files. As you can see above, I used Zlib’s crc32 function to check my file chunk. Could just as easily have been OpenSSL’s, since it too is a standard Ruby library. Two shining examples of how common CRC32 is.
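Once the fields are identified, the header is easy to pull apart with String#unpack. The sketch below assumes a simplified layout built only from the fields listed above, packed back to back and immediately followed by the chunk; the real header was larger and its exact offsets are not given here, so treat the format string, the sizes, and the names ChunkHeader, parse_chunk, and crc_ok? as illustrative assumptions. It also assumes the CRC32 covers the chunk exactly as stored in the image.

```ruby
require "zlib"

# Hypothetical header layout, assumed for illustration only:
#   4 bytes    signature
#   128 bytes  NUL-padded description string
#   16 bytes   NUL-padded version string
#   4 bytes    payload length, big-endian
#   4 bytes    CRC32 of the payload, big-endian
HEADER_FORMAT = "a4 Z128 Z16 N N"
HEADER_SIZE   = 4 + 128 + 16 + 4 + 4

ChunkHeader = Struct.new(:signature, :description, :version, :length, :crc32)

# Parse the header at the given file offset and slice out the payload
# that follows it, using the length field from the header.
def parse_chunk(image, offset)
  fields  = image.byteslice(offset, HEADER_SIZE).unpack(HEADER_FORMAT)
  header  = ChunkHeader.new(*fields)
  payload = image.byteslice(offset + HEADER_SIZE, header.length)
  [header, payload]
end

# Check the stored checksum against one computed over the payload.
def crc_ok?(header, payload)
  Zlib.crc32(payload) == header.crc32
end
```

If crc_ok? comes back true for every chunk you try, you can be fairly comfortable that you have guessed both the length field and the checksum algorithm correctly.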
4. In Which Other Headers Are Found
Now that I know what to look for, I find out that this header also prefixes some other chunks: ones that aren’t compressed and so weren’t noticed by deezee. I also check the CRC32 header field for each chunk, and it’s used on the others too. Five total, including the compressed ones. The last one is a big chunk of base64 which looks like it might be an encryption signature. Hmm… might be used for code signing somehow? Looks like I still need to confirm or discount this possibility.
5. In Which I Check For Metadata
I go back and look again at the original chunks of compressed data. The zlib headers in the file included the original filenames, which give me some clues as to what each is. Going on filenames, I’m thinking I’ve got phase one and two boot-loaders, and the third chunk is the actual app. I’m interested in the application. The Unix ‘file’ command doesn’t tell me anything useful about any of them, but it’s always worth a shot.
6. Behold: Low Hanging Fruit
I run GNU strings over the application file with ‘-t x’ so I get nice hex offsets for where the strings are found. Lots of interesting stuff:
- Uh… well there’s this:
Nice ASCII art. Just a wild guess, but I think maybe this is VxWorks?
- The regular string ‘vxworks’ shows up in a lot more places too.
- “GNU ld version 2.9-mips3264-010729”. Ok so it’s MIPS. This is also good news. It means that somewhere out there, there’s probably a GNU binutils distribution that can understand this file format.
- Lots of what looks like function names close together. When do function names occur in compiled output? Two reasons: debugging code, or a symbol table. A symbol table would be a score; I make a note of that for later.
- Lots of assertion messages with references to source code line numbers xxx.c(###). Always handy. Even if you don’t have symbols for functions, with an hour of data entry in your disassembler you can usually fake them based on debug and assertion strings. You want to take the time to do this. By the time you’ve guessed even 10% of the symbols in an image, you’re usually going to be able to comprehend 90% of what the image does, without reading any assembly.
- The strings you see when the device boots up.
- Lots of error, exception, and status messages such as one typically finds in a program. Sometimes very handy as we’ll soon see.
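The assertion strings in the list above are worth harvesting mechanically before doing any data entry. Here is a rough sketch, assuming the messages embed references shaped like file.c(123); the method name source_references and the exact pattern are guesses for illustration and would need adjusting to whatever the target’s assert macro really prints.

```ruby
# Collect "file.c(123)"-style references from a binary image, grouped by
# source file name, along with the file offset where each one was found.
# (source_references is a hypothetical helper name.)
def source_references(data)
  refs = Hash.new { |hash, key| hash[key] = [] }
  data.b.scan(/([A-Za-z0-9_]+\.[ch])\((\d+)\)/n) do |file, line|
    refs[file] << { line: line.to_i, offset: Regexp.last_match.begin(0) }
  end
  refs
end
```

Sorting each file’s hits by line number gives a rough picture of which source files are in the image and how big they are, which is a decent starting point for naming functions by hand.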
7. In Which Firmware Is Patched
I’ve got a pretty good idea of what the program is and how it’s rolled up in the firmware image file.
My job right now is to figure out whether I can change the firmware and load it onto the toaster. So I make a simple change to it: there’s a string which gets displayed when the system boots up. Using my hex editor, I change it to something noticeably different, being careful to keep it the same size as the original. Then I re-compress it at the same level as the original, re-roll it into the header format, and change the CRC32 checksum.
I load it onto my target using the bug I found… and… sweet satisfaction! If there’s any code signing here, it doesn’t work. That spooky looking signature at the end is for something else. Knowing this, it’s well worth the effort to keep reversing the image.
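The patch step boils down to: inflate, swap a same-sized string, deflate, and fix up the length and checksum. The following is a minimal sketch of that loop. It reuses an assumed header layout (signature, two padded strings, length, CRC32, all made up for illustration), assumes the CRC32 is taken over the compressed payload, and uses hypothetical names (HDR_FMT, HDR_LEN, patch_chunk). The real format had more to it than this.

```ruby
require "zlib"

# Hypothetical header: signature, description, version, length, CRC32.
HDR_FMT = "a4 Z128 Z16 N N"
HDR_LEN = 156

# Replace old_text with new_text inside a compressed chunk and return a
# new chunk with the length and CRC32 fields recomputed.
def patch_chunk(chunk, old_text, new_text, level = Zlib::DEFAULT_COMPRESSION)
  unless old_text.bytesize == new_text.bytesize
    raise ArgumentError, "replacement must be the same size as the original"
  end
  sig, desc, ver, length, _old_crc = chunk.unpack(HDR_FMT)
  body = Zlib::Inflate.inflate(chunk.byteslice(HDR_LEN, length))
  at = body.index(old_text.b) or raise ArgumentError, "string not found"
  body[at, old_text.bytesize] = new_text.b
  # The recompressed payload may differ in size, so the length is rewritten too.
  packed = Zlib::Deflate.deflate(body, level)
  [sig, desc, ver, packed.bytesize, Zlib.crc32(packed)].pack(HDR_FMT) + packed
end
```

Keeping the replacement string the same size means nothing inside the decompressed program moves, which is why this kind of patch is a safe first experiment.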
8. In Which A Loader Offset Is Sought
Now, the ‘file’ command didn’t recognize the compressed image, nor did it recognize either of the apparent “boot-loaders”. If it was a well-known format like ELF or COFF, it would have.
Unfortunately, the executable headers are what tell us how to load the whole program into memory. Without a known executable format, I’ll need to find out the load offsets myself. If I don’t, when I load the program into a disassembler, none of the data and instruction offsets will make sense. You can read a binary image with broken offsets, but it’s not fun.
Headers are usually located at the beginning of the file. But I’m not seeing any strong indications of any sort of header information at all in this one. My guess: an older version of VxWorks, predating VxWorks ELF. On the bright side, firmware images aren’t usually relocatable. Unlike a Windows PE program, which is literally edited and patched up by the linker at runtime, firmware tends to be loaded at a specific address.
I could try reversing either or both of the bootloaders to find that address, but… I might just run into the same problem walking all the way back up to the first boot-loader. To be honest, I’m not really interested in unlocking the secrets of the toaster boot loader. If I was trying to jailbreak the toaster to heat unauthorized off-brand Pop Tarts, maybe I would be. Exercise for the reader.
9. In Which I Infer A Loader Offset
I’ve got another idea. Let’s look closer at some of these strings and their surrounding data. I scroll around in my hex editor around the sections with strings until I’ve found what I’m looking for.

What we’re looking for is a list of strings preceded by their address table. At this point, I really don’t care what the strings are, just that they are recognizable text. The addresses prefixing the strings in the table are 32-bit big endian numbers pointing to the ASCIIZ strings. Only parts of the addresses match up to real addresses in the file, though, and this is key. The first entry (blue) points to 0x104DEDD8. The string actually lives at 0x004D9DD8 in the file. Subtract the two and you get 0x10005000 - the load offset for this segment.
If I’m unsure of my assumptions here, I check some of the other entries in the table (red, green, and brown). Same pattern holds. We’ve got a winner.
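The same reasoning can be automated as a simple vote. The sketch below is one possible way to do it, not the procedure actually used on the target: it finds where NUL-terminated strings start in the file, reads a run of big-endian pointers from a table you point it at, and tallies every pointer-minus-string-offset difference. The real load offset is the difference that keeps showing up. The method names and the minimum string length are assumptions.

```ruby
# File offsets of NUL-terminated printable strings at least min_len long.
def string_offsets(data, min_len = 4)
  offsets = []
  data.b.scan(/[\x20-\x7e]{#{min_len},}\x00/n) { offsets << Regexp.last_match.begin(0) }
  offsets
end

# Read entry_count big-endian 32-bit pointers starting at table_start and
# vote on which (pointer - string offset) difference occurs most often.
# Returns [delta, votes], or nil if nothing could be tallied.
def infer_load_offset(data, table_start, entry_count)
  strings  = string_offsets(data)
  pointers = data.byteslice(table_start, entry_count * 4).unpack("N*")
  votes = Hash.new(0)
  pointers.each do |ptr|
    strings.each do |off|
      delta = ptr - off
      votes[delta] += 1 if delta >= 0
    end
  end
  votes.max_by { |_delta, count| count }
end

# The worked example from the text: table entry minus file offset.
printf("0x%08X\n", 0x104DEDD8 - 0x004D9DD8)   # prints 0x10005000
```

On a large image you would restrict the string list to the neighborhood of the table, since comparing every pointer against every string gets slow quickly.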
10. In Which I Sanity Check The Offset
Search for 0x10005000 and 0x5000 near the beginning of the file. I want to confirm my assumptions about the lack of a header. Nothing comes up. I’m still very certain this is my winner. The lack of a search hit here suggests again that the loader decides where to load everything, though this program’s compiler knew where that would be when it was compiled too.
11. In Which I Disassemble
We are now at step 11 and it is time to load the file in a disassembler.
We use IDA Pro. IDA wants to know the architecture of the file. We choose a processor type of “mipsb”; that’s MIPS, which we learned from scanning the image format for strings, and running in big-endian mode.
How do we know it’s big endian? Big endian is a safe guess for non-X86 architectures, but in this case it’s more than a guess: the addresses and lengths in the file were big endian. What if we hadn’t seen a string identifying it as MIPS? I’d probably take a fragment of the file and feed it through “objdump -d” multiple times, specifying MIPS, PowerPC, ARM, and SPARC, in that order. You know you have a match when the instructions sort of make sense.
I let it load as one big segment. Once loaded in IDA, I go to “Edit -> Segments -> Rebase Segment” and enter the address I came up with in step #9. Rebasing tells the disassembler what the load offset is.
- Here I have to tell you about Thomas’ irrational fear of IDA’s rebasing. According to Tom this is based on a bad experience rebasing a 10 megabyte firmware image, waiting a day for it to finish, and only then noticing he got the address wrong. Here’s a handy trick Thomas uses instead of rebasing: feed the file to binutils “objcopy”. Objcopy is a best-kept-secret for reversing: you can feed it a file of type “binary”, and tell it to spit out an ELF file, specifying the header values. Several other tools suck at handling raw files, but rock at handling ELF. Why is his fear irrational? Well… I just disable IDA’s auto-analyze feature before I rebase. Then I do some manual poking around first to make sure I’m on the right track before turning it back on again. But… nobody tell Thomas this! He always comes up with crazy cool tricks using other tools when he gets pissed off at IDA.
12. In Which We Work Out A Symbol Table
Based on where I found strings in the image, I have a pretty good idea of the general region of memory where program data lives. Remember those function names I filed for later? Now is later.
I take the addresses of a few function name strings I find close together and search the file for the four bytes making up their (newly rebased) addresses. I’m consistently getting search hits near the end of my probable data segment, all in the same general area. Look around there and start converting every 4-byte aligned chunk starting with 0x104 to an offset. I start seeing offsets to strings of what looks like function names at 16-byte intervals. The string offsets are right next to offsets pointing way back into the code area and some pointing into the data area. I look a little closer at the hex dump of this area:

Blue is what points to the “name” of the symbol, red points to the actual thing in memory. The last column is the “type” of thing. Green things (0x500) are functions, and purple (0x700) are data. There’s also some oddballs strewn about with a type of 0x900. These point way out of bounds past the end of my rebased file. Could be a segment I don’t know about from this file, or it could just be something else created at run-time. I don’t stress about this, stick with what I know for sure right now.
I take this pattern and translate it into a structure for IDA, then find where the table begins and ends based on the pattern. This is an array, so that’s how I define it in IDA. The last element of the array is 16 bytes of 0x00 helping me (and IDA) see where it ends.
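Outside of IDA, the same table can be walked with a few lines of Ruby. The field order inside each 16-byte entry is an assumption here (one unknown word, then the name pointer, the value pointer, and the type in the last column); the text only establishes that those three fields exist, that entries are 16 bytes apart, and that 16 zero bytes end the array. SymEntry, cstring_at, and read_symbol_table are hypothetical names.

```ruby
SymEntry = Struct.new(:name, :address, :type)

# Read a NUL-terminated string at a file offset (names capped at 256 bytes).
def cstring_at(image, file_offset)
  return nil if file_offset.negative? || file_offset >= image.bytesize
  image.byteslice(file_offset, 256).unpack1("Z*")
end

# Walk 16-byte entries until the all-zero terminator.
# Assumed entry layout (big-endian words): unknown, name pointer,
# value pointer, type. Pointers are rebased by subtracting load_offset.
def read_symbol_table(image, table_offset, load_offset)
  symbols = []
  loop do
    raw = image.byteslice(table_offset, 16)
    break if raw.nil? || raw.bytesize < 16
    words = raw.unpack("N4")
    break if words.all?(&:zero?)
    _unknown, name_ptr, value, type = words
    symbols << SymEntry.new(cstring_at(image, name_ptr - load_offset), value, type)
    table_offset += 16
  end
  symbols
end
```

Entries whose name pointer falls outside the file come back with a nil name, which is a convenient way to spot oddballs like the out-of-bounds 0x900 entries.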
13. In Which I Import The Table Into IDA
Things start getting recognized quickly by IDA’s auto-analysis once I tell it about all these symbol offsets. But I also want IDA to know the names of everything in the symbol table.
I write a quick Ruby script to write an IDC script to do this work for me. IDC is the scripting language built into IDA; in the 21st century, most people write their IDA tools in Python or Ruby, but it’s sometimes faster to write short scripts that use IDC as a half-way format. Run ruby… run IDC… take a look at the results. Awesome! Just about everything was identified in this symbol table! Furthermore, everything was identified in the same segment off of one load address.
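The generator does not need to be clever. Something along these lines would do; it is a sketch rather than the script used on the project, and it assumes the symbols have already been parsed into hashes with :name, :address, and :type keys (a made-up shape) and that type 0x500 marks functions. MakeName and MakeFunction are the classic IDC calls from idc.idc; newer IDA releases use different names for them, so check your version.

```ruby
# Turn parsed symbols into an IDC script that names each address and
# asks IDA to create a function at every entry typed as code (0x500).
# (symbols_to_idc and the symbol hash shape are hypothetical.)
def symbols_to_idc(symbols)
  lines = ["#include <idc.idc>", "", "static main() {"]
  symbols.each do |sym|
    # Keep names to characters that are safe in IDA identifiers.
    name = sym[:name].gsub(/[^A-Za-z0-9_]/, "_")
    lines << format('  MakeName(0x%08X, "%s");', sym[:address], name)
    if sym[:type] == 0x500
      lines << format("  MakeFunction(0x%08X, BADADDR);", sym[:address])
    end
  end
  lines << "}"
  lines.join("\n") + "\n"
end
```

Write the returned string to a .idc file and run it from IDA’s script menu; passing BADADDR as the end address leaves it to IDA to work out where each function stops.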
Just to make sure about that missing header I check my final results from the symbol table. Indeed, two of the function symbols point right at 0x10005000, the beginning of the program. One’s called ‘sysInit’ and the other ‘start’. There can’t be a header there, it’s a function. Though I can tell as much from the name “start”, a little googling and research tells me “sysInit” is what VxWorks usually uses as its program entry point.
Besides proving the firmware loading risk, I had what was shaping up as a target rich source of vulnerabilities on my hands. I really wanted a look at the code running on the device for the purposes of additional vulnerability testing. Now that I had readable, cross-referenced disassembly, I had a lot more to work with for finding and confirming other vulnerabilities both already found, or that I might find as I moved forward.
As Tom put it, “hilarity ensued”. Truly, vulnerability research is just as good in ‘08 as it was in ‘05. Unfortunately, I can’t talk specific details from there, but I wanted to share with our readers the road I took to this point.
Getting a full reversing of an embedded system like this is pretty satisfying and can open up lots of possibilities. The DD-WRT, OpenWRT, and NSLU2-Linux Unslung crowds have gone nuts with this sort of thing, with pretty impressive results putting Linux distros on just about anything they can get their hands on.
I’d love to hear about some of your own embedded system reversing experiences.
ChiSec 17 is NEXT WEDNESDAY.
ChiSec is the single best gathering of security professionals in the Chicago metro area: it’s free of charge, free of vendors, and free of membership. You just show up, and so do other people, and somehow, by the power of the long tail cluetrain infoconomy, the whole thing works itself out, as if some mysterious tipping “point”, aided by the wisdom of the crowd and the power of thinking without thinking, is propelling it towards a freakonomic logic of life that is made to stick.
Where was I? Moneyball! No, wait: the location. It’s at Houlihan’s on Wacker, which is on the corner of Wacker and Michigan. This ChiSec only: a sure-to-be-exciting discussion about why we continue to have ChiSec at a Houlihan’s. Come armed with suggestions for alternatives!
ChiSec is next Wednesday. You do not need to RSVP.
Of course you’d rather intern with Matasano!
Are you a student looking for some experience in the information security field?
Why, yes!
Consider an internship with Matasano, in Chicago or New York. This is a paid position.
Sounds interesting. I’ve interned for security companies in the past, and got experience making copies of TPS reports, delivering mail, and even providing back massages to senior partners. What can I expect from you?
At Matasano, you can expect to do those things too. But you can also expect to:
Learn or hone reverse engineering skills
Research vulnerabilities in high-profile software
Find zero-day vulnerabilities and never talk about them!
Write reversing and security testing tools in fun languages like Ruby or ok wait just Ruby.
Not sold yet?
No.
Consider some of the projects our interns have worked on: web applications your mother has heard of, plus many that she hasn’t! Hardware and RTOS systems built for CPUs that are documented only in secret binutils distributions from India! Popular cryptosystems deployed throughout the Fortune 500!
What’s an RTOS?
Exactly! Consider whether you’re going to learn more with us than at any other internship:
You’ll do vulnerability research work almost exclusively.
You’ll likely get a diverse set of targets, from Win32 to custom embedded platforms.
You’ll have opportunities to work at a very low level (for instance, firmware and chipsets) and at very high levels (for instance, AJAX toolkits).
You’ll get a chance to develop and promote new security tools and techniques.
But I don’t know how to do most of this stuff, Thomas.
Can you code?
Sure, in Python.
Are you… interested in any of that RTOS-y, firmware-y, crypto-y security stuff?
I might be if you’d tell me what it is.
Excellent! You’ll fit right in. Here are our requirements:
Strong computer programming skills, in any language. You don’t need to be an expert C programmer, but be forewarned, you may be one by the time you leave.
Enrollment in a computer science curriculum.
Strong written English skills.
Ability to work consistent on-site core hours in either Chicago (we’re in the Loop) or Manhattan (we’re downtown).
Do you have any more details?
I do!
This is a salaried position.
Internships run between 10 and 12 weeks.
Office space and computers (we’re a Mac shop) provided.
How do I apply?
Email us at careers@matasano.com.
Coverage: Don’t Believe The Hype
More and more I hear people discussing coverage in terms of security testing. I am here to give you some bad news. You will rarely get a genuine answer on how much coverage you actually received. It is dependent on approach, methodology, tools and skill set.
51 percent of Wikipedia editors agree that the most common forms of coverage testing are:
- Function coverage - Has each function in the program been executed?
- Statement coverage - Has each line of the source code been executed?
- Condition coverage - Has each evaluation point (such as a true/false decision) been executed?
- Path coverage - Has every possible route through a given part of the code been executed?
- Entry/exit coverage - Has every possible call and return of the function been executed?
- Input Coverage - Has every input (e.g. form field, packet fields) been tested?
- Vuln. Class Coverage - Has every form of vulnerability been tested?
- Threat Based Coverage - Has every threat been evaluated?
- What is an acceptable level of coverage in a security test?
- And if you happen to own security somewhere, what would it take for you to actually find a coverage % credible? I am going to guess the M word will rear its ugly head.
- Do you ever have anything that you can even come close to measuring? There are so many states inside of real world applications, even pen test specific forms of coverage aren’t going to come close to being complete.
- If yes, can you effectively convey that to anyone in a way that will actually give them some level of assurance (the a-word of computer security)?
Defense in Depth, Reconsidered: Is Information Security Anything Like War?
Despite repeated assertion, I am dubious about the standing of “defense in depth” as a core principle for security design. It is, for example, not one of Saltzer and Schroeder’s Principles for the Protection of Information. It does, however, feature prominently in the Common Criteria, which should tell you something.
To help sort out the controversy, I enlisted the support of my colleagues, adding “thoughtful commentary on this post” to their quarterly MBOs. I got responses from:
Dave Goldsmith
Eric Monti
Max Caceres
Jeremy Rauch
“If it were me,” begins Dave, “I’d define defense in depth so that it has some meaning.” “That’d be a first,” riffs Eric. Burned! Continues Dave, “as a core principle for security design, [depth] has always been a little odd. It’s the baker’s dozen donut of design principles. Do the other ones and you get this for free.”
Eric agrees. “I think that when your approach begins from the basis of ‘depth’, you run the risk of covering bad design problems with complexity instead of rooting them out.” But Eric also associates ‘depth’ with network security, not application security, cautioning that it may make sense to employ layered security when you don’t control the “critical design elements.” “It irks me when vendors talk about ‘defense in depth’,” he says, but “I generally take it as a good sign when customers do.”
Well, as I’m thinking about it, “defense in depth” originates from military thinking. Here’s an interesting Google book search: “defense in depth” in books published between 1900 and 1980, before the term was hijacked by infosec.
In this sense, the strategy of defense in depth succeeds or fails in any computing setting only to the extent that the analogy to warfare applies to that setting. Despite its attractiveness, with clean mappings to “attackers” and “defenders”, computer security is more like a puzzle than combat. In particular, computer security challenges often lack the attributes of combat most applicable to “defense in depth”:
Attrition
Exploits are not worn down by being forced to overcome multiple defenses; the attack that hits home after an obstacle course is no weaker than that attack, unimpeded. Jeremy agrees: “Linking computer security to military strategy is a lot like using traditional military tactics against terrorism. It doesn’t work.”
“But I’ve seen experienced exploit developers get worn down and demoralized by multiple filters they need to overcome to exploit”, replies Max. “Jumping off memory in the JVM to overcome DEP involves effort to bypass a deliberate security mechanism, while doing the Heap Feng Shui dance also involves effort: this time to deal with an allocation structure that’s dynamic.”
“I know when I audit code that I have seen places where a vulnerability would have existed if not for multiple defenses,” says Dave. “Unfortunately, I can’t recall a single one right now.” My point exactly. And you have to do better than that example: you have to find the one where no single well-designed defense could have worked.
“There are no winters on the Internet,” argues Jeremy. “For a hacker, there’s no drawn-out war. It’s either attacks of convenience, or a slog against a specific target, with no penalties for taking your time.”
Deterrence
The presence of a countermeasure rarely gives an attacker pause, because the failure costs the attacker little and success rewards greatly. “The price you pay for the time expended developing an incredibly complex exploit isn’t losing life or limb,” says Jeremy. “All it costs you is a normal sex life. And that’s what porn is for!” No offense, Mark.
“There’s a saying,” Max notes. “You don’t need to be faster than the cheetah, only faster than the guy running next to you.” Not many people target OpenBSD or GRLinux when there are soft targets to strike instead.
“I’m not sure that’s a deterrent,” replies Dave. “The people who will stop looking are the people who can’t find things anyways.”
Delay
A typical attack (and, in particular, most of the attacks that top enterprise threat models) executes in milliseconds. The time costs of successive countermeasures delay the attack in sub-human timescales.
Max concedes the point. “I agree delaying an attack is meaningless. Delaying the research required to get there may be worth something, though.”
“Delaying for delay’s sake is silly,” says Dave, “but the purpose isn’t to delay. It’s to encourage the attacker to make a mistake. It’s like the old game Berzerk. Electrically charged walls. Evil robots. Time limits. All multiply the likelihood that the game will end. The more you remove, the easier the game would be. Even just sitting there as an attacker can get you caught. To maintain access to systems, you have to do something that is ultimately detectable (no matter how unlikely). Even just sitting there encourages the big bouncing head of detection to roll by screaming ‘Intruder Alert! Intruder Alert!’”
Reaction
Because they rarely buy meaningful amounts of time for the defender, countermeasures afford little opportunity for defenders to retaliate (for instance, by involving law enforcement). In fact, the grooming requirements of countermeasures often have the opposite effect, forcing defenders to chase shadows or scramble to update filters.
Max isn’t sure. “If in the process of figuring out the policies of your web app firewall I trigger 100 alerts, you may be paying more attention to me by the time I actually get the exploit right.”
Eric isn’t having any of that. “If we ever make computers HALF as smart and alert as one average armed soldier standing by a door, then defense in depth may have a chance. Until then my money is on evasion. Even with a super-smart human security expert sitting 24/7 behind an IDS today, we have no real hope of filtering and reacting in time to security events. I totally agree about the grooming requirements. I’ve seen these become obsessions that completely wash out more productive uses of time.”
Predictability
The constraints on real-world combatants are far stricter than those placed on computer attackers. Warfighters can work from a palette of reasonable assumptions, including the fact that enemies can’t teleport, stop time, shapeshift, or reverse the trajectories of bullets. Analogies can be made from each of these fantasy capabilities to real instances of real security attacks.
“That’s a stretch,” replies Max. “I do think computer attacks can be somewhat predictable, or at least, as unpredictable as an ace war pilot.”
Eric agrees, with reservations. “There are some predictable behaviors that can be useful to notice and incorporate into your defenses. But, the problem is their shelf life is so short. As soon as those defenses are known, they’re obsolete.”
“But you can do everything right in an application,” replies Jeremy, “and still find it being used for unintended outcomes.” But isn’t the infosec equivalent here “accepted risk”? “I suppose the concept of accepted risk is a lot like the military concept of acceptable collateral damage, but at least in security, the stuff at risk isn’t 18-year-old kids.”
30 comments
Introduced: A resolution resolving the semantic quarrel over malloc checking.
This is all my fault.
Many moons ago, I wrote a blog post chiding developers for checking the return value of malloc, the C function that allocates chunks of memory for programs to work with. When malloc fails, it returns NULL. According to Hoyle, you’re meant to check for that value, because malloc can fail at absolutely any time (you are not the only program claiming memory).
I stand by that argument, and by most of the wording of that blog post. Now about that word “most”.
Dave LeBlanc and I go back, though he may not remember that. Last bubble, we were dev leads on competing products. We’ve taken different career paths, and, long story short, he’s technically now more of an authority on secure coding than I am. And I’m telling you this because LeBlanc’s response to my last post is (faithful paraphrase) “are you high?”
LeBlanc thinks you’d have to be high not to check malloc returns, because:
not checking will inevitably crash the program, and crashes are bad,
not checking leads to the bug class Dowd found, and
not checking leads to the bug class Dowd found.
LeBlanc is right. I am wrong. “Not checking” is bad. Let me make a very slight semantic adjustment, so that I might be unassailably correct (again).
Here are three (extremely contrived) code examples. The first, let’s call, “unchecked”. It simply doesn’t check the return of malloc.
#define hostile /**/
void *
_setup(unsigned hostile slot, unsigned hostile id) {
    u_int32_t *slots = malloc(SLOTS_SIZE);
    slots[slot] = id; // XXX write32 corruption
    return slots;
}
The second, we’ll call “caller-checks”. As you can see, it does.
#define hostile /**/
void *
_setup(unsigned hostile slot, unsigned hostile id) {
    u_int32_t *slots = NULL;
    if((slots = malloc(SLOTS_SIZE)))
        slots[slot] = id;
    return slots;
}
Now the third, which looks suspiciously like the first, we’ll call “callee-checks”.
#define hostile /**/
void *
_setup(unsigned hostile slot, unsigned hostile id) {
    u_int32_t *slots = malloc(SLOTS_SIZE);
    // NOT REACHED ON FAILURE
    slots[slot] = id;
    return slots;
}
What’s the difference between the first and the third? In the third, if malloc fails, it does not return NULL. It instead hands the program off to a recovery regime, which, by default, safely and immediately ends the program.
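To make the callee-checks idea concrete, here is a minimal sketch of what such an allocator could look like. The names checked_malloc, set_oom_policy, and default_oom are hypothetical, picked for illustration; the post does not name its wrapper. The only property that matters is that the caller never sees NULL.

```c
#include <stdio.h>
#include <stdlib.h>

/* Default recovery regime: report the failure and end the program at once. */
static void
default_oom(size_t size)
{
    fprintf(stderr, "out of memory allocating %zu bytes\n", size);
    abort();
}

/* The active policy. Hypothetical hook so a program can swap in its own. */
static void (*oom_policy)(size_t) = default_oom;

void
set_oom_policy(void (*policy)(size_t))
{
    oom_policy = policy ? policy : default_oom;
}

/* Callee-checks allocation: returns usable memory or does not return. */
void *
checked_malloc(size_t size)
{
    void *p;

    if (size == 0)
        size = 1; /* malloc(0) is allowed to return NULL; sidestep that */

    p = malloc(size);
    if (p == NULL) {
        oom_policy(size);
        abort(); /* a policy that returns still must not yield NULL */
    }
    return p;
}
```

With something like this in place, the third example only has to swap malloc for checked_malloc; the line after the allocation is simply never reached on failure.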
What’s the difference between caller-checks and callee-checks?
First, callee-checks is safer. You can’t screw it up. The worst you can do is write a program that will abruptly terminate. This is far better than the current worst-case scenario, in which manifestly common programmer errors allow Mark Dowd to upload malicious code into your program.
Second, callee-checks is cleaner. In the caller-checks case, not only does “_setup” need to check, but so does “_setup”’s caller, and its caller’s caller, and its caller’s caller’s caller, all the way down to the place where your program inevitably gives up and terminates.
“But, Thomas”, you say, “not all programs do give up and abort. Some have policies for handling out-of-memory conditions”. And so they do. And in most cases, those policies are global, and can simply be substituted for the default behavior of exiting the program.
But I will grant you that in many cases, you have a genuinely useful recovery regime that is specific to one code-path (say, an arena-style allocation regime for a particular user request) and no global policy will help. So, I submit to you a fourth option, which I will not name, and which looks suspiciously like example (2):
#define hostile /**/
void *
_setup(unsigned hostile slot, unsigned hostile id) {
    u_int32_t *slots = NULL;
    if((slots = unsafe_malloc(SLOTS_SIZE)))
        slots[slot] = id;
    return slots;
}
Did you see the difference? It’s subtle. But it’s also easy to grep for and easy to check.
I am an advocate for checking malloc, callee-checks style. It is simply harder to screw up, and, in the overwhelming majority of cases, which you can check for yourself by randomly sampling Google Code Search, it costs you nothing in terms of reliability or functionality. Stop caller-checking malloc.
To LeBlanc’s other point about C++ and constructors throwing exceptions, I refer him to Cargill, or back to our blog, noting that exceptions are themselves inherently dangerous. “When a language feature requires you to be that-language-feature-safe”, I believe I said, “you have a security problem”.
As for his specific example: you can’t blindly throw exceptions from a ctor. Even Meyers (MECv1, Item 9) catches that one. DON’T DATE ROBOTS!
55 comments
Why Injectable Virtual Machines?
In hindsight, rather than write a post about injectable virtual machine specifications, I should have started off with the rationale behind the whole concept and explained what they are to provide context to the readers. In this post, when we speak of virtual machines, we are discussing bytecode virtual machines such as UCSD Pascal’s p-Code machine, or the Java Virtual Machine.
All an exploit does by itself is open the door to attacks in the form of payloads. To do something useful, we need a payload, which is a block of code that is injected and then does tasks for us. Sometimes an exploit is tightly coupled with the payload, but it is important to keep the two components distinct organizationally.
There are different classes of payloads akin to the classes of exploitable vulnerabilities. The oldest and most well known is the traditional shellcode. Shellcode is commonly written in machine code, and much of it spawns a command shell to allow the attacker to interact with the operating system. However, shellcode is static, inflexible, and targeted to one execution environment. Machine code needs to be written for the specific architecture of the victim. It can break with patches or other changes to the environment.
Another common payload is the syscall proxy. The attacker sends messages to the proxy to execute system calls. This is more flexible than the traditional shellcode as it allows the attacker to dynamically react to the situation in the target execution environment. A major weakness is that the driving logic is on the attacker side, and this can make it fragile. Examples of software that uses this technique include CORE IMPACT and Metasploit.
DLL Injection is another payload technique, and its advantage lies in leveraging the existing program code and libraries in memory. This allows easy implementation of higher level features. Logic can be placed on the target side, rather than relying on a proxy. However, it is static and it is usually Windows specific.
Another payload type that I find very interesting is the exploit compiler. This is typically an intelligent compiler with retargetable backends that are written in a high level language. A notable example of this is Dave Aitel’s CANVAS. It offers a very nice abstraction of lower level code, and is very flexible. However, capabilities are often fixed at compile-time.
This brings us to a payload type that I have been researching: injectable virtual machines, which are bytecode executing environments as a payload. The driving logic is in the bytecode which can be embedded in the payload, or transmitted remotely.
Typical advantages are:
- Compact. A well structured bytecode language is more compact than machine code. Once the cost in memory space is paid for the virtual machine, the actual program to be executed can be much smaller than equivalent machine code.
- Machine independent. A well written virtual machine can abstract enough that bytecode can execute regardless of the underlying architecture. There are some limitations here, such as the difference between syscall proxying on a Unix versus Windows system, but this can be abstracted by yet another layer.
- Dynamic. Because it is a virtual machine, ‘in flight missile repair’ can be conducted, changing the entire characteristic of the program environment. This is especially useful with one-shot exploits.
- Assimilation. Due to the inherent flexibility of virtual machines, this payload type is free to incorporate other techniques such as those mentioned earlier. A syscall proxy can be implemented, and DLL injection can be used to provide the virtual machine with functionality.
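For readers who have never looked inside a bytecode machine, the compactness and machine independence points above are easier to see with a toy in hand. The following is a sketch of a four-instruction stack interpreter; the opcodes, the one-byte immediates, and the name vm_run are all assumptions made for illustration and are not taken from any specification discussed in these posts.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical instruction set for a toy stack machine. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

#define VM_STACK_MAX 16

/*
 * Interpret `len` bytes of bytecode. On OP_HALT, store the top of the
 * stack in *result and return 0. Return -1 for malformed programs:
 * stack underflow or overflow, a truncated PUSH, an unknown opcode,
 * or running off the end without halting.
 */
int
vm_run(const uint8_t *code, size_t len, uint32_t *result)
{
    uint32_t stack[VM_STACK_MAX];
    size_t sp = 0; /* number of values on the stack */
    size_t pc = 0; /* index of the next byte to decode */

    while (pc < len) {
        switch (code[pc++]) {
        case OP_PUSH: /* operand: one unsigned byte */
            if (pc >= len || sp >= VM_STACK_MAX)
                return -1;
            stack[sp++] = code[pc++];
            break;
        case OP_ADD:
            if (sp < 2)
                return -1;
            sp--;
            stack[sp - 1] += stack[sp]; /* unsigned, so wraparound is defined */
            break;
        case OP_MUL:
            if (sp < 2)
                return -1;
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
            if (sp < 1)
                return -1;
            *result = stack[sp - 1];
            return 0;
        default:
            return -1;
        }
    }
    return -1;
}
```

The nine-byte program { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT } computes (2 + 3) * 4, and the same nine bytes run unchanged wherever the interpreter compiles. The driving logic lives in the bytes rather than in the interpreter.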
Bytecode virtual machines have a long history that dates back past the more common modern ones such as Python or Java. By looking at the early examples that ran in very constrained computing environments, we can transfer what we learn to a similar context.
This post should hopefully help provide more context for the readers to understand the raison d’etre behind injectable virtual machines and my research. As always, I welcome feedback and comments.
5 comments
Dowd’s Flash Report: What Have We Learned?
How nasty is the Flash vulnerability Dowd found?
Combined with any DNS vulnerability or any high-profile cross-site scripting vulnerability, the weaponized version of this attack would probably clock in at tens of thousands of compromised browsers per minute.
Is this a new bug class?
Sort of. It depends on what you mean by the term “class”. For example: most researchers consider heap overflows a separate bug class from stack overflows. In reality, though, the same underlying coding error causes both vulnerabilities: poorly bounded copies. On the other hand, epistemologically, integer overflows are a new bug class, because the underlying coding error is a type violation, which creates an unbounded copy.
See how I used the word “epistemologic” there? That means you don’t care about the difference. Wild writes from NULL pointers are probably their own bug class.
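If the integer overflow point reads as abstract, the arithmetic is short. This sketch assumes the common pattern of computing an allocation size as count times element size; the helper name size_mul_checked is hypothetical. It refuses any product that would wrap around size_t, which is the step the vulnerable pattern skips.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Store count * elem_size in *out and return true, or return false
 * (leaving *out untouched) if the product does not fit in size_t.
 */
bool
size_mul_checked(size_t count, size_t elem_size, size_t *out)
{
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return false;
    *out = count * elem_size;
    return true;
}
```

Without the check, a count of SIZE_MAX / 4 + 2 with 4-byte elements wraps to a product of 4. The allocation is then 4 bytes, while the copy loop that follows still believes in the original count, and that is where the unbounded copy comes from.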
So this is like the heap overflow revelation in the late 90s? NULL pointers are exploitable now?
No. Learn everything you can from Dowd’s paper and NULL pointers still aren’t usually exploitable:
- They need to be written to, not read from; lots of fuzzer advisories trace down to loads, not stores, from NULL.
- The offset needs to be controlled by the attacker; most of the time, offsets are hardcoded (most offsets are structure references).
- The wild write needs to happen before any pointer loads that will crash the program.
Is there a pattern worth looking for here? Absolutely. Look for things that can return NULL that have random-access indexing. Malloc is a perfect example.
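The pattern is easier to spot once the address arithmetic is written down. In this sketch the function names are hypothetical, and it assumes the usual flat-memory platform where NULL is address zero. The first helper only computes where a store through a NULL base would land and dereferences nothing. The second is the guarded store that closes the hole.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Address that a store to base[index] would hit if base were NULL,
 * assuming NULL is address 0. Pure arithmetic; nothing is dereferenced.
 */
uintptr_t
null_store_target(size_t index, size_t elem_size)
{
    return (uintptr_t)index * elem_size;
}

/* Guarded store: reject a NULL table and an out-of-range index. */
bool
slot_store(uint32_t *slots, size_t nslots, size_t index, uint32_t value)
{
    if (slots == NULL || index >= nslots)
        return false;
    slots[index] = value;
    return true;
}
```

With 4-byte elements, an attacker-chosen index of 0x1000 against a NULL base lands at address 0x4000. The "NULL" part contributes nothing to the address, so the index alone decides where the write goes.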
Wait a minute. Didn’t you say people shouldn’t check malloc?
Yes. This bug is a perfect case in point for why I’m right.
Consider: it is not the case that the Flash runtime never checks for allocation failure. What happened is, the Flash developers have an allocation checking regime that defaults to unchecked, and requires them to audit every allocation.
The way it should work is, by default, when using the simplest, most common allocation calls (malloc, or, in Flash, mem_Alloc), the program should abort if malloc fails. Returning and catching NULL is inadequate.
“But we can’t just abort when any given malloc fails! What about user-specified sizes?” You don’t have to abort on every malloc. You just have to abort by default. When you know you’re taking a value from a user, or any other unsafe input, you should use “unsafeAlloc”, which is simply malloc. Then you audit your code for the 3 places in the whole project that use “unsafeAlloc” and make sure the checking regime works.
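Here is a rough sketch of one of those few audited call sites. The name unsafe_alloc follows the naming idea above, but this exact function, and copy_untrusted_blob, are assumptions for illustration and not Flash’s code. The point is that the NULL check lives right next to the one call that can legitimately fail.

```c
#include <stdlib.h>
#include <string.h>

/* Plain malloc under a name that is easy to grep for; may return NULL. */
static void *
unsafe_alloc(size_t size)
{
    return malloc(size);
}

/*
 * Copy a buffer whose length came from untrusted input. Returns a heap
 * copy the caller must free, or NULL if the request is empty or cannot
 * be satisfied.
 */
unsigned char *
copy_untrusted_blob(const unsigned char *src, size_t untrusted_len)
{
    unsigned char *dst;

    if (src == NULL || untrusted_len == 0)
        return NULL;

    dst = unsafe_alloc(untrusted_len);
    if (dst == NULL)
        return NULL; /* the audited check: fail this request only */

    memcpy(dst, src, untrusted_len);
    return dst;
}
```

Everything else in the program keeps using the abort-by-default allocator, so an audit only has to read the handful of functions that mention unsafe_alloc.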
Doesn’t runtime security, like in Vista, solve most of these problems?
Maybe, maybe not. Obviously it didn’t here, because Flash turned runtime security off.
But look at the bigger picture. Runtime security measures like ASLR and cookies and W^X memory all address the “dumb exploit” pattern. The “dumb exploit” pattern is an artifact of hardcoded runtimes generated by C compilers. When your exploit is shotgunned in through a dumb runtime, you lose both predictability and control of the target program. That’s basically what runtime security is capitalizing on: your exploit doesn’t know where DLLs are based, and so it can’t return directly into them.
The problem is, the hardcoded runtimes are going away. The vast majority of code written going forward targets extremely complicated runtimes, like the bytecode VM in ActionScript. In a bytecode VM scenario, an exploit has much more flexibility:
- There might be 10x as many places to overwrite that will compromise the target; for instance, the abstract syntax tree objects containing method tables.
- Valuable information might be readily accessible from known relative offsets or, better still, from registers kept loaded with interpreter state.
- Just as with ActionScript, the content buffer that vectors your exploit in might be executable in the target runtime, leaving you only with the problem of compromising the verifier.
- Just as with ActionScript, there may be an extremely powerful executive running on top of the CPU, rather than just machine code instructions running directly on the CPU.
These are all ways that high-level languages make runtime security harder.
But high-level languages are supposed to be a huge security win!
They probably are. But remember, even in the most intricate schemes (and Javascript compiled to a bytecode VM that runs off the system stack qualifies), high-level languages are really just glue around low-level languages: the most interesting features in Python, Ruby, and Javascript are implemented natively.
So, you get two interesting phenomena:
- You need to audit the runtime to make sure that the C code that implements the core language isn’t vulnerable (this is why Perl was a bad bet in 1995, when everyone was saying that buffer overflows were C’s fault).
- You need to audit all the native extensions (such as Quicktime for Java), bearing in mind that unlike a server or a client, the attack surface for a language extension is arbitrary callers with arbitrary arguments, a much more painful place to be.
Has Mark Dowd simply outclassed us? Should we pack it up and quit?
Yes. But don’t feel bad about that. You’re a human being, and he’s a remorseless killing machine. Deep Blue crushed Kasparov, and now he’s not the prime minister of Russia! At a certain point, you have to concede the field, moving on to games where human beings still have the advantage. Computers haven’t solved Go, for instance. For us researchers, I suggest we take advantage of Mark Dowd’s robotic inability to love, and take up the arts, such as watercolors or interpretive dance.
15 comments