Alan Smithee | May 7th, 2009 | Filed Under: Uncategorized
Risk Management in Information Technology
A Response to the Article “Why Silent Updates Boost Security”
http://www.techzoom.net/publications/silent-updates/ by Thomas Duebendorfer and Stefan Frei
“We recommend any software vendor to seriously consider deploying silent updates as this benefits both the vendor and the user, especially for widely used attack-exposed applications like Web browsers and browser plug-ins”(Duebendorfer & Frei, 2009)
In the world of security there seem to be two schools of thought: those who focus on risk management and those who focus on technical solutions. These two groups fight like cats and dogs, but both are necessary for a more secure system. Articles like “Why Silent Updates Boost Security” fall under the technical-only mindset (dogs, because they are always dogging people to update stuff), which holds that a vendor should silently push updates to its software. I have more of a risk management mindset (panther; well, it’s a type of cat) and I believe this does not work in an enterprise system.
This is another article in the long battle (between cats and dogs) over the role of patch control. Some people believe that all systems should be immediately patched to prevent any and all known security bugs and anything less is negligence. On the other end are people who believe that once a system is in place it should never be patched because that could affect the stability of the applications residing in the system. Neither of these extremes is correct, and this is where risk management comes into play.
Consider an enterprise with hundreds of apps, maybe 30% of them customized. If it patched all its systems (Windows, Linux, Firefox, or any other software it may have) without testing the patches, there could be unintended consequences that break custom apps or other apps residing on the system. While the systems would be more secure, user satisfaction would drop and business output would suffer. Security is important, but if the business does not function we all have to go look for other jobs, and that’s not as enjoyable as it sounds.
Often I hear people state that security patches just fix holes and shouldn’t have an effect on systems anyway. In theory this is true; however, patches rarely fix only security holes. They often fix other things, or add features that may interfere with software residing on a system. With silent updates, an enterprise may not be aware of the patch until it is already in place. The last thing an IT person who has to do support wants is software that changes without their consent, or at least their knowledge.
In some corporations where I have worked, the Information Technology department installs a software update server so it can download patches and test them in a QA environment before rolling them out to clients. While this delays the installation of a patch, it lets the company test its applications against the patch first. As with any QA in IT, most companies do not want to spend time or money on this, and they either roll out patches haphazardly or do not roll them out at all. This is obviously poor security and shows either a lack of risk management or a conscious decision to take on a much higher level of risk. Neither is an excuse to push management of patches back onto the vendor. The systems belong to the company and as such are the company’s responsibility. And upper management is going to blame IT anyway, so we might as well attempt QA.
The article does discuss patching strategies, and points out that vendors put out two types of patches: emergency patches, and patches on a regular schedule, such as quarterly. This does not help IT departments unless vendors post the patches early, allowing IT to test them in its own environment. Also, if a patch does break an existing internal system, IT has to have a way to block the patch until either a fix is in place, or the risks are weighed and the patch is blocked permanently. Silent updates performed by the vendor could cause some real damage if left unchecked.
Ultimately it comes down to risk management. Do we want vendors to roll out updates silently and control risk management for us, or should Information Security Officers step up and perform risk management themselves (in conjunction with upper management, of course)? While the article makes a good point on how to secure an application (browsers in this case), in the real world most enterprise environments are too complex to allow vendors to perform silent updates. Obviously this is a panther’s perspective. Any dogs want to argue?
Ruby for PenTesters: Inline PacketFu With An Embedded Ruby Interpreter
This post is part of an informal series discussing the Ruby programming language for various tasks related to penetration testing. Many of us at Matasano do a lot of our coding in Ruby. We like using it when we need to write our own tools (which we do frequently). If you’re interested in more of our Ruby tools then browse back through our blog, or read on!
A pentester has many options available to him/her when it comes to hijacking packets off the wire for modification. A lot of people take the proxy route; I prefer the transparent hijack method myself. A while ago I wrote a tool called QueFuzz. It’s a fuzzer written in C that uses libnetfilter_queue to fuzz packets transparently. Basically you run an iptables command:
iptables -A INPUT -p tcp --dport 9191 -j QUEUE
This queues all TCP traffic with a destination port of 9191 for QueFuzz to fuzz and send back on the wire. It works pretty well, but after all it’s a fuzzer written in C, and that can only be stretched so far.
When it comes to writing fuzzers I prefer a flexible language like Ruby. I wanted to keep my beloved QueFuzz around for a while, but I could not find any Ruby/NetFilter wrappers on the internet (yes, one exists for libipq, but that’s out of date). I absolutely hate wrapping C libraries in Ruby; it’s often a giant mess and it takes forever. Wrapping libnetfilter_queue would take too long. I needed a better solution, one that could use my existing code and get me fuzzing packets in about 30 minutes. So I turned the tables and embedded the Ruby interpreter into my C code.
This is documented out there on the internet if you’re interested. Here is a really simple example of how to call a Ruby script from C and share a string between them:
And there we go, we created a string in C, passed it to our Ruby script, modified it with the script and printed it out in C using the STR2CSTR macro. There are of course more elegant ways of doing this, but this is a good place to start.
Where was I? Oh yeah, so I had all this perfectly good C code sitting around and it was going to waste. I implemented something similar to the example outlined above with my libnetfilter C code, and in about 30 minutes I had a 4 line Ruby script modifying packet data passed to it by my C program. This is great: I was able to use my old code and add a dynamic language into the mix in no time at all. I call it QueRub.
Now of course I don’t need the fuzzing code I wrote in C (it’s awful anyway). We now have a choice to either do the packet processing in C, and hand the packet off to a different Ruby method depending on the packet contents, or implement the logic in Ruby. The Ruby script is the obvious choice, as it’s much faster to implement and does not require any compilation. QueRub does however recalculate IP/TCP/UDP checksums after Ruby passes back the packet. This, too, is easy in Ruby, but the C code was already in place so I kept it.
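To illustrate the claim that checksum recalculation is easy in Ruby, here is a minimal sketch of the IPv4 header checksum (the one's-complement sum described in RFC 1071). QueRub itself does this step in C; the method names `internet_checksum` and `fix_ip_checksum` are made up for this example, and the TCP/UDP checksums, which also cover a pseudo-header, are left out.

```ruby
# One's-complement sum of 16-bit big-endian words, as described in RFC 1071.
def internet_checksum(data)
  data = data.b                           # work on a binary copy
  data << "\x00" if data.bytesize.odd?    # pad to a whole number of 16-bit words
  sum = data.unpack('n*').sum
  # Fold any carries back into the low 16 bits.
  sum = (sum & 0xffff) + (sum >> 16) while sum > 0xffff
  ~sum & 0xffff
end

# Hypothetical helper: returns a copy of the packet with a fresh IPv4 header checksum.
def fix_ip_checksum(pkt)
  pkt = pkt.b
  ihl = (pkt.getbyte(0) & 0x0f) * 4       # IP header length in bytes
  pkt[10, 2] = "\x00\x00".b               # checksum field must be zero while summing
  pkt[10, 2] = [internet_checksum(pkt[0, ihl])].pack('n')
  pkt
end
```

A script could end with something like `$pkt = fix_ip_checksum($pkt)` if the host program did not already take care of it. Running `internet_checksum` over a header whose checksum field is correct yields 0, which makes for a handy sanity check.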
Here is a quick overview of how to run and use QueRub:
We take our example IPTables rule from earlier and queue all outbound TCP packets with a destination port of 9191.
iptables -A OUTPUT -p tcp --dport 9191 -j QUEUE
Run QueRub by telling it which script we want it to call and which method should be invoked when a packet is received. It will not check if your method exists, so make sure you don’t fat finger that.
$ cat simple-querub.rb
def my_method
  puts "Got a packet!"
  puts $pkt.inspect
  puts "Packet Size: #{$pkt.size}"
end
$ ./querub simple-querub.rb my_method
Using telnet, connect to a remote instance of netcat on TCP port 9191. I sent the string “THIS IS A TEST!” and here’s my QueRub terminal output:
Got a packet!
“E2000E\301\325@00@06z\313\177000001\177000001\254A#\347\301X\206\312\301,j37\200300201\
211\23500000101\b\n02UPh02U/\215THIS IS A TEST!\r\n”
Packet Size: 69
So let’s take a real example: a protocol that base64 encodes its entire payload. Pretty simple, but entirely realistic. Maybe we want to inspect its payload and replace it with another. This should be trivial in Ruby. Our packet is already a string, so all we need to do is decode the string, inspect it, make our modifications, re-encode it and pass it back to netfilter.
First we will need a simple Ruby script to decode and modify our payload. Our example is really simple and does not check the packet type or read the appropriate packet headers.
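require 'base64'

def simple
  if $pkt.size > 52
    # Everything past the hard-coded 52-byte IP + TCP header is treated as payload.
    pkt_b = $pkt[52, $pkt.size]
    puts "Base64 Payload: %s" % Base64::decode64(pkt_b)
    # Escape the payload before using it as a pattern: Base64 output can contain '+' and '/'.
    $pkt = $pkt.gsub(/#{Regexp.escape(pkt_b)}/, Base64::encode64("Hijacked by QueRub!"))
  end
end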
Creating a simple client to test this is easy as well:
require 'base64'
require 'net/telnet'
o = Base64::encode64("This is a test!")
c = Net::Telnet::new("Host" => "127.0.0.1", "Port" => 9191, "Telnetmode" => false)
c.cmd(o)
Now we run QueRub:
$ ./querub simple.rb simple
Run a server:
$ nc -l -p 9191
Run the client:
$ ruby client.rb
In the QueRub terminal you should see the following when the packet is intercepted:
Note: The netfilter library passes us the entire packet from the IP header down, so be careful of the offset you start to fuzz at. In my example I cheated and grabbed the data using a hard coded offset of 52 (IP header + TCP header).
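Rather than hard-coding 52, the offset can be read out of the headers themselves. The sketch below assumes an IPv4 packet carrying TCP, which is what the examples in this post use: the low nibble of the first IP byte gives the IP header length in 32-bit words, and the high nibble of byte 12 of the TCP header gives the TCP header length in the same unit. The method names are made up for illustration.

```ruby
IPPROTO_TCP = 6

# Returns the byte offset of the TCP payload, or nil if this is not an IPv4/TCP packet.
def tcp_payload_offset(pkt)
  return nil if pkt.bytesize < 20
  return nil unless (pkt.getbyte(0) >> 4) == 4        # IP version
  ihl = (pkt.getbyte(0) & 0x0f) * 4                   # IP header length in bytes
  return nil unless pkt.getbyte(9) == IPPROTO_TCP     # IP protocol field
  return nil if pkt.bytesize < ihl + 20               # need a full minimal TCP header
  data_offset = (pkt.getbyte(ihl + 12) >> 4) * 4      # TCP header length in bytes
  ihl + data_offset
end

# Returns the TCP payload bytes, or nil if no offset could be determined.
def tcp_payload(pkt)
  off = tcp_payload_offset(pkt)
  off && pkt.byteslice(off, pkt.bytesize - off)
end
```

For a 20-byte IP header followed by a 32-byte TCP header (20 bytes plus options), this yields the same 52 used above, and it keeps working when the option length differs.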
Ruby’s flexibility opens up all kinds of doors for us. With one ‘require’ statement it’s hooked into Eric’s blackbag, which lets us do stuff like “$pkt.blit”, which would take a packet from one connection and stuff it into another pre-existing session. In fact a lot of the rbkb tools are a good companion to QueRub when it comes to reversing and fuzzing protocols. Another no-brainer is Matasano’s Ruckus, which makes parsing and fuzzing packets really easy. Fuzzing is not the only useful thing we can do here. We can use this to manually reverse engineer network protocols with ease by dropping into Ruby’s interactive shell (IRB) when a specific packet is received, making a manual edit and sending the packet on its way.
“But doesn’t this call a Ruby script each time a packet is received!?” - Yeah, it’s pretty inefficient, but it’s a fuzzer so don’t worry so much. If you’re looking for uber-fast 1gbps fuzzage then you should probably stay away from Ruby in general.
I should probably mention this will only work on Linux because of the netfilter dependency. But this is a good thing because it brings the idea behind Jeremy’s PDB to another platform (PDB used BSD Divert sockets). Here is the full code and an example Ruby script for modifying packets. I will most likely create some neat tools by mashing QueRub with rbkb and ruckus. Keep an eye out for them. Enjoy.
This post is part of an informal series discussing the Ruby programming language for various tasks related to penetration testing. Many of us at Matasano do a lot of our coding in Ruby. We like using it when we need to write our own tools (which we do frequently) because it helps us rapidly get off the ground doing all kinds of bespoke things that would be more tedious and time-consuming (or even impossible) using off-the-shelf tools. Go ahead! Call us fanboys (and girls!!!). We can live with it. We want to share some practical tips based on our experiences in this realm.
Recently, Timur Duehr and I had the opportunity to look at a rather large third-party product based mostly on Java/J2EE. Timur had been focusing mostly on the web front ends, but had already identified a lot of Java enterprise components and asked me my thoughts about writing some custom security testing tools for them. We’d been using ‘jad’ (our java decompiler of choice) to decompile some of the classes we had access to and we were already sifting through Java code getting a sense of things. One of us (I honestly don’t remember who) piped up and said, “I wonder if we could use JRuby to hook into this stuff.”
In the past when I’ve found myself doing security testing on Java-based targets, I usually “go with the flow” for the target and may develop some of my custom tools in Java when I need to. But, I definitely don’t code in Java “just for fun”. This is because frankly, Java just really isn’t fun to code in. So, even though neither of us had really done much with JRuby before this point, Timur and I didn’t spend much time debating this, and we quickly decided to give JRuby a try and see how far it got us. The idea now seems like a no-brainer in retrospect. I’m happy to say that it was a great experience. I just wish I’d started doing this a lot earlier.
Here’s some of the basic rationale and a few short code examples to highlight things.
First and foremost: we have tons of Ruby code lying around for common tasks that we were able to leverage using JRuby, most of which works just fine since it’s “native ruby” (or MRI, the Matz Ruby Interpreter). Avoiding the use of too many third-party C library extensions pays off big-time here. All together, I’ve written way too many incarnations of little utility function libraries in a variety of languages over the years. I’m done for a while… I don’t want to again if I can avoid it. ‘Nuff said.
But beyond that, JRuby really shines on its own in several ways. Aside from just MRI integration, it sports truly seamless Java integration. Here’s a simple “hello world” using ‘jirb’, the JRuby equivalent to the interactive ruby shell.
So we just created a ruby Time object containing the current time and saved it to a file using Java serialization methods very much the same way a pure Java program would be written. It’s interesting to note that we created a *Ruby* Time object, not a Java one. Somehow everything worked just fine. So, obviously now we want to restore it back right?
Notice at the end that the ‘restored_time’ object is actually of a Java type now (java.util.Date or Java::JavaUtil::Date to ruby’s namespace). What happened here? JRuby’s automatic type conversion took over as the object was serialized to the ‘saved_time.ser’ file. When we restored it, that’s how it came back to us in Ruby via the JVM.
Next (and this is where things begin to get interesting for pen-testing), if you want to hook into a third party ‘jar’ or ‘class’ file and mess around with it, it’s very straightforward. JRuby loads classes automatically using the same classloader rules as Java. If you need to do so explicitly it’s as simple as ‘require “/path/to/whatever.jar”’ or ‘require “/path/to/SomeClass”’ (which pulls in SomeClass.class).
Getting back to pen-testing: The app Timur and I were looking at had several components using Java Remote Method Invocation (JRMI) as a communication protocol behind the scenes. This included “thick client” components using JRMI to talk back to central management servers for things like administration and system monitoring as well as some of the servers using it to communicate with each other between tiers. RMI actually uses Java Serialization to transfer objects across the network under the hood (see… I was going somewhere with that). RMI is basically CORBA for Java. RMI has been around for a while, too — it’s at least 10+ years old.
We run into JRMI frequently enough in both custom in-house developed apps as well as third-party Java/J2EE based enterprise products out there these days. RMI is yucky stuff from a security standpoint. Gotta be blunt here: whenever a technology empowers people to shovel business logic around the network without really thinking about security (because it’s totally abstracted from the process)… you get yucky stuff.
RMI is one of those things for which, from a pen-testing standpoint, very few tools exist. Frankly, RMI probably gets overlooked and glossed over a lot because of this misinterpreted “obscurity”. One security tool comes to mind that did anything with RMI. It’s been a while since I used Nessus, but I recall it pulling down RMI registry listings when it identified an RMI registry server. So to kick off, here’s a basic RMI port scanner written in JRuby that does basically the same thing I remember Nessus doing. It looks for RMI endpoints on remote ports, then attempts to list the registry for endpoint names if an RMI registry server is on that port.
Here are the relevant bits of code that do the registry listing:
Let’s try this out against a basic RMI example. Here is one I whipped up and expanded slightly from this Sun tutorial.
$ rmiregistry &
$ java HelloServer &
Hello Server is ready.
$ ./rmiscan.rb localhost 1099
** Found a possible RMI endpoint at //localhost:1099
** Found RMI Registry at: //localhost:1099
//localhost:1099/Hello
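As an aside, the first step of such a scan (deciding whether a port speaks RMI at all) does not need a JVM. The sketch below is not the scanner used above. It is a plain-Ruby approximation based on the JRMI wire protocol as I understand it: the client sends the magic bytes "JRMI", a two-byte version and a protocol byte (0x4b for the stream protocol), and a server that accepts replies with a ProtocolAck byte (0x4e). The version value of 2 and the helper name `rmi_endpoint?` are assumptions made for this example.

```ruby
require 'socket'
require 'timeout'

# Assumed handshake: "JRMI" magic, 2-byte version (2 here), 0x4b = stream protocol.
JRMI_HANDSHAKE = "JRMI\x00\x02\x4b".b
PROTOCOL_ACK   = 0x4e # first byte of the reply when the server accepts the protocol

# Hypothetical helper: true if host:port answers the handshake with a ProtocolAck.
def rmi_endpoint?(host, port, timeout = 3)
  Timeout.timeout(timeout) do
    TCPSocket.open(host, port) do |sock|
      sock.write(JRMI_HANDSHAKE)
      reply = sock.read(1)
      !reply.nil? && reply.getbyte(0) == PROTOCOL_ACK
    end
  end
rescue SystemCallError, SocketError, IOError, Timeout::Error
  # Refused, unreachable, reset or silent ports are simply "not RMI" for our purposes.
  false
end
```

Something like `puts "possible RMI endpoint" if rmi_endpoint?("localhost", 1099)` would then stand in for the detection step. Listing the registry names still needs the Java RMI classes, which is where JRuby comes in.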
Listing endpoints is about as far as Nessus went, as I recall. “Whoopie” right? How about actually identifying the methods exported via those endpoints? … and then calling them? Note, I’m not knocking Nessus here; there are reasons why it doesn’t go much further than this, as we’ll see later.
Ok so sure, you could probably code something up pretty quick in Java. This is what people have done historically when testing RMI-based applications. But then you’d be coding in Java…
… cut to …
… somewhere far away, a kitten named “Panda” tragically strangles herself to death while playing adorably with her ball of yarn …
… cut to …
You don’t realize it yet, at least not fully, but your “inner happy” has just died a little bit at this very same moment.
This is because you’re just 5 lines into writing your “quick proof of concept”… in Java… and you’ve stopped to ponder whether it would be better to refactor your “quick proof of concept” as a HelloScroobalizer factory before you compile and oh… which methods should be protected…
include Java
import java.rmi.Naming
#...
registry = Naming.lookup(regurl)
registry.list.each do |remote_name|
  puts "RMI Interface Found: #{remote_name}"
  begin
    remote = registry.lookup(remote_name)
    puts " #{remote}"
    remote.java_class.declared_instance_methods.each do |meth|
      puts " #{meth.to_s}"
    end
  rescue
    puts " **ERROR** #{$!}"
  end
  puts
end
Running this tool, we get:
$ ./rmiquery.rb //127.0.0.1:1099
RMI Interface Found: Hello
**ERROR** java.rmi.UnmarshalException:
error unmarshalling return; nested exception is:
java.lang.ClassNotFoundException: Hello_Stub
(no security manager: RMI class loader disabled)
Oops… what just happened? Well, RMI uses Java serialization, remember? So this means that in order for RMI to do its deserializing thing on the client, we need our client’s JVM to be able to instantiate the relevant interface classes exported by the remote RMI server. Actually, what the error is really telling us is that the JVM doesn’t have the interface class definition, and our RMI security manager (the one we didn’t specify in our project so the default took over) wouldn’t let the class loader dynamically look for one on the network. I’ll um… just… copy this over for now.
This isn’t JRuby’s doing, it’s the same for Java. RMI just works this way. We’re supposed to have, at a minimum, the class files with the object’s interface in order to interact with it fully on the remote RMI server. JRMI does support dynamic class loading mechanisms over the network. But, in practice, most apps just include the necessary ‘.jar’ files somewhere in the local class loader library path. This is where pen-testers and bad guys will find them too. I like to think people do the simple thing because of some sense of automatic self-preservation kicking in when hearing things like “dynamic network class loading” and “RMI class-loading over HTTP uses a hard-coded URL”. But, in the end, the classes usually aren’t that hard to get either way and having access to the necessary interface classes is the only “security requirement” most JRMI apps enforce on their clients.
Since we’re on this tangent: some pretty interesting security research work was recently presented by Adam Boulton (formerly of Corsaire) on fully enumerating RMI interfaces and calling their methods starting with no class definitions at all. Despite some cloaked caveats in Adam’s presentation (“oh… and… there’s this apparently random 64-bit number you need to know, but don’t worry about that.”) his attack really does help drive home how insecure most JRMI implementations probably are. Unfortunately, it looks like the tools he wrote in Java will not be published any time soon. On that note, I’ll say that in Adam’s video, he used a pretty hands-on process in the later steps of his presentation. Not knocking his work, but using a dynamic scripting language like ruby for the same thing might raise some exciting possibilities for automation and even dynamic class generation. But I’ll leave this as an exercise for the reader for now.
So anyway, with the HelloInterface.class in hand, we run the same command again: (quick note, the Java class loader rules kick in now and load the file automatically from the current directory)
public final java.lang.String $Proxy1.say() throws java.rmi.RemoteException
public final int $Proxy1.getCacheSize() throws java.rmi.RemoteException
public final void $Proxy1.setCacheSize(int) throws java.rmi.RemoteException
public final int $Proxy1.hashCode()
public final boolean $Proxy1.equals(java.lang.Object)
public final java.lang.String $Proxy1.toString()
public final int $Proxy1.add(int,int) throws java.rmi.RemoteException
There we go. This time our tool produced prototypes for all the methods exported via the “//127.0.0.1:1099/Hello” endpoint. Now, finally, let’s call some of them from jirb. It’s incredibly straightforward… I can see why people would want to use this JRMI stuff in their products:
So there you have it. Just a scratch on the surface of what you can do with JRuby, but not too shabby. We’ve identified Java-based remote procedure services, enumerated them, and interacted with them by invoking remote methods, all without having to touch a line of actual Java code. If you’re like me, and too much Java makes you cranky, give JRuby a shot next time you run into something where you need to dig into Java technologies! You’ll probably spend less time coding and go home with more findings (to your kittens who miss you!)
Dave G. | January 20th, 2009 | Filed Under: Uncategorized
I just read Amrit’s latest post about Heartland Payment Systems, which if you haven’t heard is at the center of what may be one of the largest compromises of data EVER. Of course, this is computing, so pretty much assume that we will continually have larger and larger incidents (this is basically Amrit’s point).
Heartland has a website up for customers (which is smart, but could use some more information). You can expect this story to get a lot of traction over the next couple days/weeks. A lot of unanswered questions, like:
What widespread global cyber fraud operation? Who else is affected?
When did they know about the intrusion?
Did they really try to slide this under the radar by announcing today?
Did the compromise come through the payment processing system?
Did the intrusion happen before April 30th, 2008?
If the last two questions are both answered with a yes, what does this mean for PCI certification? From the list of certified vendors:
— Image totally snipped (with permission) from Amrit.
Mike Tracy | January 19th, 2009 | Filed Under: Uncategorized
Gary McGraw wrote an article published on InformIT called The Top 11 Reasons Why Top 10 Lists Don’t Work. In his article he takes care to extol the obvious virtues of these lists so he’s obviously of the opinion that they may have some merit. To be honest, I’m certain that we have similar opinions on the subject, but in the spirit of discourse, I present:
The Top 11 Reasons Why Top 10 Lists Rock
(McGraw’s original list item in italics)
Executives don’t care about technical bugs. Executives do care about bugs in general, specifically how they affect risk. It’s impossible to manage unknown risk. Executives already know which applications are the most critical to running their business and which assets are the most important to protect (at least I hope so). Having a Top N list of ways your applications could get hacked at the very least gives you a place to start asking questions. You don’t have to understand the technical nitty-gritty of each issue to ask your managers “What are we doing about X?” or “What do we need to do to keep Y from happening?”.
Too much focus on bugs. Fixing bugs also helps fix design flaws. Finding bugs in software exposes the underlying patterns that caused them. Any development organization worth talking to (or about) is already using some form of Root Cause Analysis when dealing with software defects. By enumerating the problems along with how they can be prevented, managers are better able to help their teams find solutions to underlying causes of security issues.
Vulnerability lists help auditors more than developers. A well written vulnerability list really can help developers. If we were to strip the SANS list of everything about vulnerabilities and only keep the sections where we talk about Preventions and Mitigations, we’d have a list that would be just as useful, but would get no coverage in the media.
One person’s top bug is another person’s yawner. Top bugs are at the top for a reason. I can’t fault Dr. McGraw’s logic here except that he misses the original intention. This list isn’t meant to predict the top defects you’re going to find within your development organization. If I am working with a team that doesn’t use an RDBMS for a backing store (unheard of you say!), SQL injection isn’t going to apply to them. This list (and I don’t want to speak for the SANS crew) should be taken à la carte. If you aren’t using similar technologies to the ones that cause the issues in the list, then by all means feel free to ignore them. Don’t throw the proverbial baby out with the bathwater.
Using bug parade lists for training leads to awareness but does not educate. Examples of exploited vulnerabilities assist in understanding why each mitigation tactic is important. Arguments against FUD aside, if you can use real-world examples of why something is bad, it provides a compelling argument for getting it right.
Bug lists change with the prevailing technology winds. Bug lists change because we’re raising the bar. This may not be true now but as the prevention strategies outlined in lists like this continue to be evangelized, they eventually take hold. Even though the listed vulnerabilities are still being attacked, they aren’t as easily exploitable and so will probably drop down or off the next list.
Top ten lists mix levels. Taxonomies are better because there’s an example of one in my book. O.K., this is just snarky and uncalled for but I don’t have a book to cite and I do have a point to make here. The *best* sources of information are the ones that you can make work in your organization. Whether you’re using a Taxonomy presented by one set of genius security people or a List of vulnerabilities and their prevention strategies presented by another, the goal of both is the same. Which makes more sense for you? I agree that a bare list of vulnerabilities would not be wise to use as a starting point for anything except a Letterman bit. As I’ve said elsewhere (and I’m sure I’ll say again), the prevention strategies are where it’s at.
Automated tools can find bugs - let them. Automated tools can find bugs - but will they? Static analysis tools have their place and I am not going to argue against them. There’s no substitute for human testing. Even for web applications tested with an automated tool, there are many types of vulnerabilities that won’t be caught. Getting developers on board with secure coding practices and giving testers the tools and knowledge they need to provide as much information about risk as possible is Nirvana.
Metrics built on top ten lists are misleading. Metrics are misleading if you make them misleading. Going into a discussion of security and testing metrics is beyond the scope of a simple blog post and I don’t fundamentally disagree with McGraw’s assertion. But seriously, what kind of metrics would we develop from this kind of list anyway? If you’re talking about after-action metrics like “Number of Times our Site Got XSSed”, you’ve already lost. If you try to use this information to develop more secure coding practices and get some security testing going on, you have a better chance of realizing some value.
When it comes to testing, security requirements are more important than vulnerability lists. Verbatim, agree with this sentence 100%. He goes on to say, “Security testing should be driven from requirements and abuse cases…” My question here is: How do you create an abuse case without understanding the underlying vulnerability? How do I create security requirements without having at least some understanding of how attackers are trying to get at my assets?
Ten is not enough. No, but if you use the information you’ve been given as part of a comprehensive security strategy (wow am I ever sticking my foot in it here), it’s a start. If you had a list of preventative measures for each of the 700 issues referred to and piped them through sort | uniq, I think you’d be surprised at what a workable idea you just came up with.
Both the SANS and OWASP lists declare the purpose of their publication is to educate developers and other stakeholders on how to eliminate the most commonly exploited vulnerabilities from their software. They both use available data (the MITRE CVE database) to rank vulnerabilities based on prevalence (how often do we see this happen?) and consequence (what damage can this do if I get hit by it?), along with other metrics of lesser importance. I agree, just the list would be pretty worthless. The best part of each of these lists, however, is the pains they go to in providing excellent information on mitigation strategies and tactics. Use this information to whatever advantage you can. It is, after all, free.
stephen | October 1st, 2008 | Filed Under: Uncategorized
Hi!
I am Stephen A. Ridley. I recently started here at Matasano as a Senior Researcher (working out of the Manhattan office). I studied Physics, but for work I do software reversing, protocol replication, and exploit development. Before Matasano, I was at McAfee as a Senior Security Architect, in a small (5 person) R&D group learning from all-stars like Mark Dowd, John Viega, and David Coffey. Prior to McAfee I was at Aegis Research (which became ManTech Security and Mission Assurance) supporting the U.S. Defense and Intelligence communities doing reversing and vuln research. I got the opportunity to do all kinds of other neat stuff there, but mostly I got to be batboy for all the grand slammers on that team.
Here at Matasano, I again find myself fortunate enough to be on another phat team. I (probably like most of you) came up following groups like Teso, #ADM, and antisec.is while getting amused by groups like b4b0, ~el8, and GOBBLES. Also, like many folks in this industry, my motivation tends to wax and wane, limboing between states of limerence and ‘jaded disillusionment’. (If you remember, for a while folks thought it was all over after ~2001…but here we are.)
While the game is definitely different now, there is still some inspiring stuff being done. Most recently some of the public discoveries and techniques I found to be pretty re-inspiring in different ways were:
And Dowd’s AS3 (oldschool technique with newschool application level impact)
Work like this serves as reminders that there is still a lot of unexplored landscape out there with plenty of good work waiting to be done regardless of how bleak the future might sometimes look for cool bugs. I look forward to “settling in” to work on some of the neat projects we have lined up here at Matasano and hopefully posting a bit here on the blog.
Thomas Ptacek | July 21st, 2008 | Filed Under: Uncategorized
Earlier today, a security researcher posted their hypothesis regarding
Dan Kaminsky’s DNS finding. Shortly afterwards, when the story began
getting traction, a post appeared on our blog about that
hypothesis. It was posted in error. We regret that it ran. We removed
it from the blog as soon as we saw it. Unfortunately, it takes only
seconds for Internet publications to spread.
We dropped the ball here.
Since alerting the Internet earlier in July about the upcoming
announcement of his finding, Dan has consistently urged DNS operators
to patch their servers. We confirmed the severity of the problem then
and, by inadvertently verifying another researcher’s results today,
reconfirm it today. This is a serious problem; it merits immediate
attention, and the extra attention it’s receiving today may increase
the threat. The Internet needs to patch this problem ASAP.
Dan told me about his finding personally, in order to help ensure
widespread patching before further details were announced at the
upcoming Black Hat conference. We chose to have a story locked and
loaded for that presentation, or for any other confirmed public
disclosure. On a personal level, I regret this as well.
Dan did phenomenal work on this research. It was impossible to talk to
him today and not know that he was sincere about coordinating a
graceful disclosure and fix for the problem. That I helped detract
from that work is painful both personally and professionally, and I
apologize to Dan for the way this played out.
An intern expects to be given simple projects, like coffee retrieval,
or “Hello, World.” So I’ve been sorely disappointed by Matasano. I
have been offered coffee retrieval services by senior engineers and my
latest project has been anything but “Hello, World.”
In fact, it’s been more like, “Hello, OS X. Tell me your secrets”.
This is the story of one trial-by-fire project handed to an intern
that turned out to be more complicated than anyone expected.
1.
It started with Thomas, innocently enough, handing me some debugger
code. It was both C and Ruby, and for Solaris and Win32. He said, “I
would like you to port this Win32 Ruby code to OS X.”
“Um, okay.”
At that point I’d just finished learning the basics of Ruby via my
previous Matasano project, a database-backed HTTP proxy. I knew
nothing about debuggers, let alone the low-level C library calls I’d
need or the Ruby bindings to make them work. I know, fun, right?
I started simply and dusted the C off in my head so I could begin to
read and understand the code Thomas dumped on me, and perhaps learn
how a debugger works and gets used. It took a day or two just to read
it. I’d ask the office some fairly basic question about debuggers, and
receive in return a much longer response than I’d anticipated. Like a
tutorial on the workings of x86 assembly. Eventually, I got to a point
where I was almost comfortable with how the C debugger worked.
When staring at C code stopped doing me any good, and writing Ruby
code started seeming feasible, I moved on to porting the Ruby
code. “How hard could it be?”.
2.
Thomas gave me a starting point. Our Ruby code called directly into C
libraries using Win32API and Ruby/DL. We have wrapper libraries that
make those C calls look like Ruby library functions. So, for instance,
in our Wrap32 library, we have:
    # just grab some local memory
    def malloc(sz)
      r = CALLS["msvcrt!malloc:L=L"].call(sz)
      raise WinX.new(:malloc) if r == 0
      return r
    end
We had a small piece of this written for OS X as well. I had to build
it out. I started with getpid(), a simple system call I could make
sure worked before I moved on to something harder. It worked right
away. My confidence was high. I was feeling cocky.
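For anyone who wants to repeat that first smoke test today, here is a minimal sketch. It assumes Fiddle, the standard-library successor to Ruby/DL (which current Ruby versions no longer ship), and the `Libc` module name is made up for illustration; it is not the wrapper library from the post.

```ruby
require 'fiddle'

# Hypothetical wrapper module. The original project used Ruby/DL;
# this sketch assumes Fiddle instead.
module Libc
  # dlopen(nil) resolves symbols from the already-loaded process image,
  # which includes the C library.
  HANDLE = Fiddle.dlopen(nil)

  # pid_t getpid(void): no arguments, integer return value.
  GETPID = Fiddle::Function.new(HANDLE['getpid'], [], Fiddle::TYPE_INT)

  def self.getpid
    GETPID.call
  end
end

# Cross-check the C call against Ruby's own idea of the process ID.
puts Libc.getpid == Process.pid
```

Comparing against `Process.pid` is the whole point of starting with getpid(): there is a known-good answer to check the binding against before moving on to calls that have none.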
Here I should mention that I’d never worked on a decently large coding
project before. This was my first.
Throughout this project I’ve been trying to design the whole thing in
my head long before I actually write even a single function. So,
I had many questions:
What was the script implementing the debugger to look like?
Was it to be event driven?
Did we want objects to represent each process, threads, or to
make his lunch for him?
I was overzealous. The team was patient. Thomas said simply, “There is
no spoon. You’ll need ptrace() and wait() for the breakpoint
insertion and signal catching. Just copy the functionality from the
Win32 version.”
3.
A brief word from the team about how debuggers work.
The thing you most want to do with a debugger is set and handle
breakpoints. On X86, there are two kinds of breakpoints: hardware and
software. You mostly use software breakpoints. The way software
breakpoints work is, you pick the place in the program you want to
break at, and you replace the first byte of the instruction there with
“INT 3” (conveniently enough, this is just the byte “0xCC”). When the
program hits the INT instruction, it generates an interrupt. The OS
catches the interrupt and kills the program.
Unless you have a debugger attached. If you have a debugger attached,
instead of killing the program, the OS tells the debugger. The
debugger then swaps the original instruction back in, “rewinds” the
program back to it, and resumes execution.
Every OS has debugging features. They boil down to the following
four capabilities:
Reading and writing the memory of another process (that’s
how you swap INT in for instructions to set breakpoints).
Catching events from other processes, like breakpoint
interrupts.
Starting, stopping, and pausing threads inside other
processes.
Changing the register state in other processes, for
instance by moving the EIP register back 1 byte to rewind
the INT 3 instruction that just fired.
The best known Unix debugger interface is ptrace(), and it basically
does all four of those things for you, along with the wait() call
for detecting events. On Win32, any program can read or write from a
process it has the right permissions for, even if it isn’t a debugger;
the debugger mostly exists to catch interrupts.
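The set-and-rewind cycle described above can be sketched without touching a real process. This is a toy model only: "memory" is a binary Ruby string, the "CPU" is a hash with an `:eip` entry, and the class name `ToyBreakpoints` is invented for the example.

```ruby
# Toy model of a software breakpoint. No real process is involved.
INT3 = 0xCC

class ToyBreakpoints
  def initialize(memory)
    @memory = memory   # binary String standing in for process memory
    @saved  = {}       # address => original byte
  end

  # Remember the original byte, then overwrite it with INT 3.
  def set(addr)
    @saved[addr] = @memory.getbyte(addr)
    @memory.setbyte(addr, INT3)
  end

  # Called after the trap fires. EIP points one byte past the INT 3,
  # so put the original byte back and rewind EIP onto it.
  def handle_hit(regs)
    addr = regs[:eip] - 1
    return false unless @saved.key?(addr)

    @memory.setbyte(addr, @saved.delete(addr))
    regs[:eip] = addr
    true
  end
end

code = "\x55\x89\xE5\x90\xC3".b  # push ebp; mov ebp, esp; nop; ret
bps  = ToyBreakpoints.new(code)

bps.set(3)
patched = code.getbyte(3)        # the nop is now INT 3

regs = { eip: 4 }                # the CPU has just executed the INT 3 at offset 3
bps.handle_hit(regs)
restored = code.getbyte(3)       # the nop is back

puts format('patched=0x%02x restored=0x%02x eip=%d', patched, restored, regs[:eip])
```

In a real debugger the two `setbyte` calls become writes into the other process, and the `regs` hash becomes a get/set of the thread's register state; the bookkeeping stays the same.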
4.
Coding the wrappers for ptrace(), wait(), and waitpid() didn’t
take too long. Each just takes a few integers and returns an
integer. But ptrace works with request codes, like “PEEK” to read
memory or “STEP” to single-step the process. I couldn’t test without
knowing all the request codes. So, I started reading man pages, poking
at code and trying to get my OS X functions to work.
“To the headers!” I cried. But which one and where are they? As I
mentioned, I’m a little new to real — as in non-academic —
programming. Google worked OK to get the man pages, but didn’t
include the request code numeric values, just the names and what they
did. Frustrated, I asked for help.
“find /usr/include | xargs grep ptrace | less” was the response I
got from Thomas. You didn’t know he speaks *nix? He does. Hexadecimal too,
from what I’ve heard.
A little reading and some copying later I had the constants I needed,
and began to test my ptrace and wait functions. The code wasn’t
pretty but it seemed to work. I could attach to a process by PID and
wait() for it. Now I just needed to get its registers and I’d be
almost done.
It didn’t take long to sketch my code based on the Win32 debugger I
was given to start with. Soon I had what I thought was the start of a
functional debugger in Ruby, along with a handy explanation of the
Ruby way of doing things. Up until that point I’d been trying to do
things the C way, passing variables by reference, trying to make the
Ruby function call an exact match to the C call, and other things I’d
picked up from the C/C++/Java I learned in college.
I thought I was doing well. Then I tried to find the OS X equivalent of
PTRACE_GETREGS to read the registers from other processes, which is
kind of important for debuggers.
5.
Here everything starts to get more complicated.
It turns out Apple, in their infinite wisdom, had gutted
ptrace(). The OS X man page lists the following request codes:
PT_ATTACH — to pick a process to debug
PT_DENY_ATTACH — so processes can stop themselves from being debugged
PT_TRACE_ME — so debuggers can launch processes that start debugged
PT_CONTINUE — to restart a program after it’s been stopped
PT_STEP — to execute just one instruction in the process
PT_KILL — to kill the process
PT_DETACH — to release the process
No mention of reading or writing memory or registers. Which would have
been discouraging if the man page had not also mentioned PT_GETREGS,
PT_SETREGS, PT_GETFPREGS, and PT_SETFPREGS in the error codes
section. So, I checked ptrace.h. There I found:
PT_READ_I — to read instruction words
PT_READ_D — to read data words
PT_READ_U — to read U area data if you’re old enough to remember
what the U area is
PT_WRITE_I — and write instructions
PT_WRITE_D — and data
PT_WRITE_U — and U
PT_SIGEXC — and EXC SIGs
PT_THUPDATE — and update THs
PT_ATTACHEXC — and attach EXCs
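Pulled together, the two lists look something like the table below. The numeric values are written down as I remember them from Darwin's sys/ptrace.h and should be treated as an assumption: check them against the header on your own machine (the find/grep one-liner above will locate it) before trusting them. The module name `PtraceRequest` is made up for this sketch.

```ruby
# Request codes as remembered from Darwin's <sys/ptrace.h>.
# ASSUMPTION: verify every value against your local header.
module PtraceRequest
  PT_TRACE_ME    = 0
  PT_READ_I      = 1
  PT_READ_D      = 2
  PT_READ_U      = 3
  PT_WRITE_I     = 4
  PT_WRITE_D     = 5
  PT_WRITE_U     = 6
  PT_CONTINUE    = 7
  PT_KILL        = 8
  PT_STEP        = 9
  PT_ATTACH      = 10
  PT_DETACH      = 11
  PT_SIGEXC      = 12
  PT_THUPDATE    = 13
  PT_ATTACHEXC   = 14
  PT_DENY_ATTACH = 31

  # Reverse lookup, handy when printing a trace of ptrace() calls.
  def self.name_of(code)
    constants.find { |name| const_get(name) == code }
  end
end

puts PtraceRequest.name_of(9)
```

Keeping the codes in one module means the ptrace() wrapper itself can stay a dumb "four integers in, one integer out" call.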
There’s one problem solved. I can read and write memory for
breakpoints. But I still can’t get access to registers, and I need to
be able to mess with EIP.
That’s when I start hearing “It has to work, otherwise gdb
wouldn’t”, rather frequently, from more than one person.
Well, ptrace() won’t work for retrieving registers in OS X.
Matasano Secret Intern X referred me to Nemo’s article at
uninformed.org. In it, Nemo lays out the Mach kernel calls that
replace some of the lost ptrace() functionality. So, I wrote
wrappers for:
task_for_pid — to find the Mach task of an OS X process
mach_task_self — to get my debugger’s task
task_threads — to walk the threads inside a task
thread_get_state — to get the registers for one of those threads
thread_set_state — to change those registers
Since I wasn’t using them natively in C I needed to know more about
the usage of each function.
“No problem,” I thought, “I’ll just fire up terminal and… Oh, bloit!” No man pages.
I pored over Nemo’s work, what I could find in the headers, and
figured out how to call the functions. Now another problem. The Mach
functions take pointers to raw C memory.
The way I was told to handle this was, pack the data I needed into
Ruby strings or native numeric types with Ruby/DL. After a long, dark
period of messing with calls to “strdup” and “DL.malloc”, I found
“String#to_ptr”, and at last managed to get the Mach functions
working.
I had also found the correct way to get errno through Ruby/DL:
DL.last_error. This appears to be documented nowhere in English.
Except for an odd bus error I ran into now and then (but couldn’t
duplicate), my Ruby debugger was working and could read and write
registers. I’d even checked to make sure they were coming back to me
in the correct sequence.
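Checking the sequence matters because the kernel hands back one flat buffer, not named fields. A rough sketch of the marshalling, assuming the 32-bit x86 layout (sixteen 32-bit values in the order of Darwin's x86_thread_state32_t, as I recall it; verify against the Mach headers). The helper names are invented for the example and the sample values are arbitrary.

```ruby
# ASSUMPTION: field order of Darwin's 32-bit x86 thread state structure.
X86_REGS = %i[eax ebx ecx edx edi esi ebp esp ss eflags eip cs ds es fs gs].freeze
STATE_BYTES = X86_REGS.size * 4

# Turn the raw buffer filled in by the kernel into a name => value hash.
def unpack_thread_state(buffer)
  unless buffer.bytesize == STATE_BYTES
    raise ArgumentError, "expected #{STATE_BYTES} bytes, got #{buffer.bytesize}"
  end

  X86_REGS.zip(buffer.unpack('L16')).to_h
end

# Turn a register hash back into a buffer suitable for a set-state call.
def pack_thread_state(regs)
  X86_REGS.map { |name| regs.fetch(name) }.pack('L16')
end

# Arbitrary sample values standing in for what the kernel would return.
raw = [0xc0003, 0x32390, 0xbffff74c, 0x90e441ba, 0, 0, 0xbffff768, 0xbffff74c,
       0x1f, 0x286, 0x90e441b5, 0x7, 0x1f, 0x1f, 0, 0x37].pack('L16')

regs = unpack_thread_state(raw)
regs[:eip] -= 1                      # rewind over a one-byte INT 3
rewound = pack_thread_state(regs)

puts format('eip=0x%08x', unpack_thread_state(rewound)[:eip])
```

The size check is cheap insurance: a buffer of the wrong length is an early hint that the call wrote somewhere other than where you think it did.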
Then, running my get_registers() function repeatedly, I found the
registers of a stopped process changing on every call. When I printed
them without marshalling, they occasionally contained the names of
some of the functions I’d written.
“Oh, bloit! I’m really chakked now. I’ve been calling a bloitting buffer overflow a register lookup,” I
said to myself. I despaired of my project and my future.
6.
On the train home and all weekend I looked through Apple’s
documentation. Google. The header files. “It has to work; otherwise gdb
wouldn’t,” another friend said. But he wasn’t able to find the
documentation I was looking for. He did find fxr.watson.org and some
better explanations of the functions at
web.mit.edu/darwin/src/modules/xnu/osfmk/man/. Those turned out to be
gold later.
During week one of coding:
several necessary functions wrapped and working
DL.txt is really the only Ruby/DL documentation that exists
Ruby/DL is great for simple C function wrapping but rough around the edges when it comes to more interesting calls.
Average familiarity with Ruby
Basic understanding of how a debugger works
A Ruby object that can attach to a process, continue it, detach from it and wait() for it.
One really convoluted method to read/write random locations in memory
Average familiarity with system calls in C (now rust free)
7.
Starting the following week, things went a little smoother.
I had my coding flow going. I had better documentation than just
header files. I started reading the Mach kernel code.
I wrote a small program in C to test the sequence of system calls I
was using in Ruby. If it worked in C, why didn’t it work in Ruby?
Then, I found it. I was calling task_threads() wrong, passing a
pointer where it expected a pointer-to-pointer. Whee! I
vetted the results with gdb’s output.
eax      0xc0003      786435
ecx      0xbffff74c   -1073744052
edx      0x90e441ba   -1864089158
ebx      0x32390      205712
esp      0xbffff74c   0xbffff74c
ebp      0xbffff768   0xbffff768
esi      0x0          0
edi      0x0          0
eip      0x90e441b5   0x90e441b5
eflags   0x286        646
cs       0x7          7
ss       0x1f         31
ds       0x1f         31
es       0x1f         31
fs       0x0          0
gs       0x37         55
They agreed! I went home for the day.
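The pointer versus pointer-to-pointer mix-up is easy to make from Ruby, because both look like "some buffer" at the call site. Mach calls can't be demonstrated portably, so this sketch uses libc's strtol() as a stand-in: its second argument is also an out-parameter that receives a pointer. It assumes Fiddle rather than the Ruby/DL used in the post.

```ruby
require 'fiddle'

libc   = Fiddle.dlopen(nil)
strtol = Fiddle::Function.new(
  libc['strtol'],
  [Fiddle::TYPE_VOIDP, Fiddle::TYPE_VOIDP, Fiddle::TYPE_INT],
  Fiddle::TYPE_LONG
)

text  = "123abc"
input = Fiddle::Pointer[text]   # char *: points at the string's bytes

# The out-parameter needs room for ONE pointer. Passing this slot hands
# the callee a char **, which it fills in with a char *.
endptr_slot = Fiddle::Pointer.malloc(Fiddle::SIZEOF_VOIDP, Fiddle::RUBY_FREE)

value = strtol.call(input, endptr_slot, 10)

# Read the pointer the callee stored in the slot, then follow it.
end_addr = endptr_slot[0, Fiddle::SIZEOF_VOIDP].unpack1('J')
rest     = Fiddle::Pointer.new(end_addr).to_s

puts "value=#{value} rest=#{rest}"
```

Passing a data buffer where the slot belongs would still "work" in the sense that the call returns; the callee just writes an address over the first few bytes of the buffer, which is the kind of bug that shows up later as garbage in unrelated places.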
8.
Now for wait(), to catch debugger events. wait() was hanging the
debugger if I called it more than once. I set it up to use the
NOHANG option. I fixed a return value error.
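Once wait() returns, the status word still has to be decoded to tell "stopped at a breakpoint" from "exited". A small sketch of that decoding, assuming the traditional Unix status encoding (low bits for the state, next byte for the signal or exit code); the module name is made up and the bit layout should be checked against sys/wait.h on the target system.

```ruby
# ASSUMPTION: traditional wait status layout, as in the W* macros of <sys/wait.h>.
module WaitStatus
  WNOHANG = 1   # waitpid() option: return at once if no child has changed state

  def self.stopped?(status)
    (status & 0xff) == 0x7f
  end

  def self.stop_signal(status)
    (status >> 8) & 0xff
  end

  def self.exited?(status)
    (status & 0x7f).zero?
  end

  def self.exit_code(status)
    (status >> 8) & 0xff
  end

  def self.signaled?(status)
    !stopped?(status) && !exited?(status)
  end

  def self.term_signal(status)
    status & 0x7f
  end
end

SIGTRAP = 5
status  = (SIGTRAP << 8) | 0x7f   # what a breakpoint stop looks like

puts WaitStatus.stopped?(status) && WaitStatus.stop_signal(status) == SIGTRAP
```

Ruby's own Process.wait2 returns a Process::Status with `stopped?` and `stopsig` that does the same job; hand-decoding only matters when the status comes back from a wrapped C call.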
Then, I tested single-stepping with ptrace. Kernel panic.
I put that on the list of broken parts of ptrace to be replaced by a
Mach call.
Next up was setting breakpoints. They seemed to install themselves
without error but the child wasn’t stopping when I ran the command that
would hit the breakpoint I’d set. Upon inspection, the breakpoint was
replacing an instruction of -1, which gdb told me was actually
0x55.
I started researching the problem, finding only hints. Did I mention
ptrace was gutted in OS X? I read the source for Apple’s version of
gdb. Thomas gave me a copy of a DTrace truss and said, “Just do
whatever gdb does.”
It took me a while to get the script working. It seems iTunes causes
errors in truss (also dtruss) whenever it’s running. I closed
iTunes and started watching gdb for ptrace calls. Rather
quickly I noticed an extreme lack of calls to ptrace.
Was gdb even using ptrace for reading the process’ memory?
It became apparent ptrace was only really used by gdb to:
prevent the process from exiting on signals
pass signals to the child after it processed them.
I then remembered that uninformed.org article. A quick read reminded
me that Mach vm_read and vm_write were needed to replace PT_READ
and PT_WRITE.
The next day, Thomas was in the office to check on my progress. To
move things along he implemented vm_read and vm_write for me while
I confirmed a few things with truss and looked for vm_read calls
in gdb. I didn’t find any. When he finished the functions, I used them
in my breakpoint setting routines. No errors.
No stopping at breakpoints either.
Again the instructions were -1. When I mentioned this Thomas
informed me I’d probably need vm_protect as well. Why hadn’t I
thought of that? Not too long after that I was able to set and remove
breakpoints correctly! I went home for the long weekend.
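Why vm_protect matters: code pages are normally mapped read/execute, so a write into them is refused until the protection is changed. The sketch below is a toy model of that sequence, not a binding to the real Mach calls. The protection bits and return codes use the values from the Mach headers as I remember them, and `ToyPage` is invented for illustration.

```ruby
# Toy model only: one "page" of memory with Mach-style protection bits.
VM_PROT_READ    = 0x1
VM_PROT_WRITE   = 0x2
VM_PROT_EXECUTE = 0x4

KERN_SUCCESS            = 0
KERN_PROTECTION_FAILURE = 2

class ToyPage
  attr_reader :bytes

  def initialize(bytes, prot)
    @bytes = bytes
    @prot  = prot
  end

  # Stand-in for vm_protect(): change what the page allows.
  def protect(prot)
    @prot = prot
    KERN_SUCCESS
  end

  # Stand-in for vm_write(): refused unless the page is writable.
  def write(offset, byte)
    return KERN_PROTECTION_FAILURE if (@prot & VM_PROT_WRITE).zero?

    @bytes.setbyte(offset, byte)
    KERN_SUCCESS
  end
end

page = ToyPage.new("\x55\x89\xE5\xC3".b, VM_PROT_READ | VM_PROT_EXECUTE)

first_try = page.write(0, 0xCC)               # refused: page is read/execute
page.protect(VM_PROT_READ | VM_PROT_WRITE)    # make it writable
second_try = page.write(0, 0xCC)              # accepted
page.protect(VM_PROT_READ | VM_PROT_EXECUTE)  # put the protection back

puts "first=#{first_try} second=#{second_try}"
```

The toy fails loudly on the first write. The real lesson from the episode is to check the return code of every call, so that a refused write can't pass for a successful one.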
During week two of coding:
wrapped and implemented all necessary system calls
added thread state and breakpoint manipulation to Debuggerx
gained some knowledge of OS X internals
found a repeatable kernel panic
learned basic usage of dtrace and gdb
learned I tend to overthink my code before writing it
began to use irb as a scratch pad for testing functions
9.
Now another problem. You can set a breakpoint with the debugger. You
can catch the breakpoint. You can resume the process. But you can’t
reset the breakpoint without single stepping: to resume the process,
you have to clear the breakpoint.
But PT_STEP was panicking the kernel!
I settled on setting the TRAP flag in the EFLAGS register to simulate
single-stepping with ptrace. This seemed to work. But then I started
getting bus errors when I resumed the process. I verified with Thomas
how they were supposed to work. I tried watching gdb for vm_write
calls with truss again: nothing. After some debugging I discovered waitpid()
was clearing the trap flag, which Thomas informed me was correct
behavior. Some more monkeying around trying to get it working ate up
the rest of the day.
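For reference, the trap flag is bit 8 of EFLAGS (mask 0x100). Setting it makes the CPU raise a single-step trap after the next instruction, and the flag is cleared again when the trap is delivered. A minimal sketch of the bit manipulation; the helper names are made up.

```ruby
TRAP_FLAG = 0x100   # bit 8 of EFLAGS on x86

def set_trap_flag(eflags)
  eflags | TRAP_FLAG
end

def clear_trap_flag(eflags)
  eflags & ~TRAP_FLAG
end

def trap_flag?(eflags)
  (eflags & TRAP_FLAG) != 0
end

eflags  = 0x286                   # a typical user-mode EFLAGS value
stepped = set_trap_flag(eflags)

puts format('0x%x -> 0x%x', eflags, stepped)
```

Since the flag only covers one instruction, it has to be set again before each resume if you want to keep stepping.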
The next day, I was able to pass through a breakpoint and reset
it. The only problem was that the breakpoint wasn’t being reset fast
enough: it wasn’t reinstalled immediately, one step after it was
hit. After clearing up some
confusion on my part with Thomas, I decided to try PT_STEP again. It
worked and didn’t panic the kernel this time. Finally, I had a
debugging tool that was complete!
All that remained was to clean up some debug tracing prints and
implement a better method to view the registers. Both fairly simple
things completed early the next day.
10.
There it is, the story of the birth of DebuggerX. A “simple” porting
task handed to an intern to better his understanding of debuggers and
Ruby. During the project I’d become quite familiar with Ruby, learned
some OS X internals, found a kernel panic in ptrace, and learned
better programming techniques. I still tend to overthink my code and
“have a hard time believing that you’re supposed to ask programs to do
the things it looks like they need to do,” according to Thomas, but I
have learned it’s quite a bit easier to try something in code than in
your head. Since completion of the project as originally stated, I’ve
added calls to get information about a thread and begun looking into
retrieving a list of function symbols from the process’ file. I’ll
make another post about that in the future.
Matasano is a team of internationally respected security experts who have led security efforts at @stake, Microsoft, ISS, Secure Computing, Arbor Networks, Secure Networks, Bloomberg, Sandia Labs, and others. Read more about our team and how we can help you today.