Do Enterprise Management Systems Dream Of Electric Sheep?

Open up a browser window with the slides here.

Introduction

I rehearsed this talk in Ann Arbor, Michigan with the DC734 people in June. I went way over the amount of time we allocated. When Dave and I sat down to polish up the talk before Black Hat, we decided the best thing to do with the material was to approximately double the number of slides. At the end, we did a dry run (thanks to Nico for skipping a talk to help us) and determined that if we did every slide in exactly 43 seconds, we’d be able to finish this talk, as long as nobody had questions.

So what happens when we go to give the talk? The AV system goes down for 20 minutes. I do not think we handled the situation gracefully.

I’m sitting in my hotel room on Wednesday night nursing my blown out vocal cords, determined to recover the value I saw in this talk originally. Here are some notes for the slides. Hopefully the talk reads better than it sounded. Thanks to everyone in the audience at Black Hat for the feedback. You were great.

Our Talk

Slide 1

Do enterprise management systems dream of electric sheep?

There’s this joke security researchers tell each other. It’s a bit of a cult thing. You may never have heard it. Everyone tells it differently. Here’s how I do it.

Salesguy walks into the office of a security architect at this big Fortune 100 global-fi. Says, “we’re a security product; we want you to sign off on us.”

Architect goes, “sorry, I don’t let products get deployed without an audit”.

But the guy’s like, “this product is amazing, and I really want you to hear about it.”

Architect sighs. “What do you got?”

“Well, we get all of our stuff installed on one machine, and get your IT guys to roll that into the default build for all your desktops and servers. Leave an agent running on all those machines.”

Architect makes a face, but salesguy keeps talking. “Then we get this server installed somewhere in the middle of the network. And we send a message to all those agents, tell them to connect back to the server for directions. And the agents read that message and go talk to the server and reconfigure themselves.”

“So each of the agents makes a connection to the server. And the server feeds them files to install on the machines. The Windows desktops get new Windows software. The Linux servers get kernel patches. The mainframes get new... well, whatever those things run.”

Salesguy keeps going. “Later on, the server connects back. It talks this protocol we encrypted by XORing a repeating 32 bit number. The protocol was designed in 1991. Some of the machines have long hostnames, and that blows a buffer in the server and crashes the server process; that’s OK, the server just restarts and keeps going. One of the agents has a single-quote in its name. That breaks an SQL query in the server. Cryptic errors all over the place.”

Architect is cradling his head in his hands. But salesguy’s on a roll. “Some smart ass out there stuck Javascript in his host name. Agent feeds that back to the server. An IT admin logs into the web interface for the server. All of a sudden his web browser goes nuts. Starts submitting weird commands to all the agents, through the server. Uploads new Windows DLLs everywhere. The servers have all become BitTorrent repeaters. The desktops are fighting some DDoS battle with an Agobot botnet somewhere in Romania. And the desktops are kicking Romania’s ass, because Agobot has nothing on the software we installed on your network. Then our sales engineer poops on the floor, and he goes and brings out this big horse, and then…”

Architect waves his hands. “Stop. Stop! What the hell do you call this thing?”

“Secure Enterprise Management! No, wait, The Aristocrats!”

Slide 2

Hi. You know who we are. Me and Dave gave this talk. Dino did tech support. Jeremy donated an AV cable. Window gave an early version of the internal security slides and helped us dry-run the talk.

Slide 3

For the past year we’ve been doing research on internal vulnerabilities. Our hypothesis: applications that tend to be deployed in front of firewalls have improved security dramatically over the last 10 years, while applications deployed behind firewalls have not.

The “in front of the firewall” applications, due to exposure, are marquee targets. People pay attention to vulnerabilities in them. This creates the dangerous perception that the security of all applications is improving, because the rate at which we see new web server vulnerabilities is decreasing.

We’ve tested this hypothesis. We are not surprised by the results. Our talk is about one class of vulnerable internal apps. Let’s set the scene by talking about some of the others.

Slide 4

Not a fan of this slide; it’s Apple Keynote gloss and I failed to resist it. I am ashamed.

It makes an obvious point: there are way, way more different internal apps than there are perimeter applications. And the value of internal assets is higher than the value of external apps.

Slide 5

Project Chinashop

We’ve been looking at internal apps. What have we found?

  • SAN is internal network storage technology. It’s like a file server, except instead of exporting directories, it exports an entire disk, which you mount and format and fsck just as if it was a SCSI drive. iSCSI targets are protected with CHAP authentication. When we looked at it, we found we could bypass CHAP by setting a bit in the iSCSI header.

  • D-R (disaster recovery) is slang for backup. We looked at the market leading network storage DR protocol (a filesystem mirroring system). Remote kernel heap overflows.

  • We will say things that will make you stop laughing about CORBA in a bit.

  • Same with message-oriented middleware. Also, we will tell you what the heck message oriented middleware is.

  • iFCP is a fibre-channel analog of iSCSI. On the market leading iFCP appliance, we found a remote pre-auth vulnerability that lets us take over the appliance.

  • Agents are what we’re here to talk about.

  • We’re keeping statistics on how long it takes after sending details, packet traces, and code to vendors before they patch, so we can publish findings. Based on our research, our current estimate for the average amount of time this process takes is: infinity.

Slide 6

If you thought writing firewall rules for your perimeter was a pain, consider this: there are thousands of random protocols running on internal networks, between tens of thousands of pairs of hosts. And when any of those connections are disrupted, IT screams bloody murder.

Slide 7

Researchers. You guys have to be getting bored trying to find still more SQL injections in web applications, or novel ways to exploit cross-site scripting. We know it’s getting harder and harder to find good stuff on “exposed” applications.

Getting worried about your long-term prospects? Take heart. There’s a whole other class of applications. They get deployed behind firewalls. They’re internal apps, and nobody has looked at them at all. It’s 1993 back there.

And there are many times more of them than there are external applications.

Slide 8

Take storage for example.

The most popular storage applications are well-known security debacles. CIFS and NFS are two of the most popular attack vectors ever. And I suppose that in the tens of companies that use Apple file sharing, AFP is a huge problem as well.

Speaking of which: we have an Apple AFP vulnerability we’re publishing today. If you are in charge of the network security for a graphic design shop, you should patch immediately!

The new storage protocols aren’t any better from a security perspective. They’re basically radioactive: safely useable only with an insane amount of shielding and isolation.

And access control for storage, of any stripe, is insane. A single resource shared by many hosts at wildly different levels of privilege. Nobody ever gets this right.

Slide 9

People who think about backup security quickly realize that regardless of how well you secure your servers, or encrypt your traffic, or police the networks, at the end of the day all the data in the business winds up in the hands of backup systems.

We hadn’t thought about this. We didn’t realize it. We had to be told. But then we looked at the protocols themselves, and, well, read the slide.

Slide 10

I don’t know why we even have this slide. Dave?

Slide 11

Printers are storage. Usually for the files you care most about. How secure are printers? CIFS, Web, LP, FTP, SNMP. Nobody ever had problems with those protocols before. But you know what? The guys who are going to get these ridiculous protocols right? The printer developers. Uh huh.

There’s this other joke security researchers do. It’s more like a game. This one where, you pick some random technology that everyone thinks is hot, and then you play out what the worm that took advantage of it would do. Like, “VOIP worm!”. Or, “IM worm!”. “MySpace worm!”. “Wikipedia Worm!” (51% of all Windows desktops believe: you’re owned!).

Here’s mine. See, there’s this worm, and it targets the embedded network print server in high-end laser printers. It gets delivered over MySpace, so when someone in your company goes to read about System of a Down, it gets access to your internal network. It exploits a trivial stack overflow in one of the 50 protocols the printer speaks. You know how Slammer took down the whole internet using only 70,000 infected hosts? And like, most networks only had a couple SQL servers that were vulnerable? Well, how many network printers do you have? Yeah, this worm is pretty bad.

So the Internet is melting and meanwhile nobody can print, which basically means that 99% of all paper-based business process worldwide (check printing, contracts to sign, etc) grinds to a halt while everyone in IT rushes to CompUSA to fight for the last USB inkjet printers like deranged parents on Christmas Eve trying to get that last tickle-me-whatever doll and, oh. Wait. I guess my version of this joke isn’t funny. Sorry.

My favorite part of this worm? I didn’t think of it. But, it rootkits the printer. Rewrites some flash memory thingy. I don’t know, I didn’t write it. Anyways, from that point on, any time someone prints something that looks like a spreadsheet? It goes and messes with the numbers first. Real subtle like.

Slide 12

There is an upside to all of this. Well, if you’re a security researcher.

Slide 13

But not if you’re responsible for securing a large internal network. Sorry.

Slide 14

Sorry. That’s a lot of background material. Let’s get less hand wavy, and get on with the talk.

What we are here to tell you is, your IT staff has done you the favor of pre-installing botnets on your network. That’s ok! They’re good botnets! Go thank them, and read on.

Slide 15

Our talk is about vulnerabilities in agent-based enterprise management systems. We don’t have a good name for these products. We’re using the name we see vendors use most, which is “Distributed Systems Management” (DSM).

DSM is how a multi-billion dollar software vendor spells “Botnet”. The word “bot” is spelled “agent”.

Agents are installed everywhere. The standard build of your WinXP desktop probably has a bunch of agents installed. Servers have agents. Mainframes have agents. They make IT’s life a lot easier. Really. The IT people we’ve talked to about this stuff really like these applications.

Slide 16

Let’s put this in perspective.

  • Botnets have Command & Control (C&C) systems, that let one 15 year old control 150,000 infected machines. Usually, that’s IRC. It’s simple and it works.

    DSM systems have C&C systems. They are more sophisticated than IRC. They are what the kid who wrote SDBot tells his friends he’s going to write next on AIM, or his MySpace page, or whatever.

  • Botnets aspire to amateurish encryption.

    DSM systems have already got there.

  • You always need a password to talk to a bot. I guess this is one of the ways bots are different than agents.

  • Bots are effectively open-source. Their C&C protocols are simple and easy to analyze. This is another way bots are different from agents.

  • You can filter a bot by scrubbing all the IRC traffic in your network. You can filter infected agents by scrubbing all the traffic in your network.

Bots and agents usually run with local administrator privileges.
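To make "amateurish encryption" concrete, here is a minimal sketch of why XORing traffic with a repeating 32-bit key protects nothing. The key, the message format, and the function names are all made up for illustration; the only assumption is that the attacker knows four bytes of plaintext, such as a fixed protocol header.

```python
import struct
from itertools import cycle


def xor_repeating(data: bytes, key: bytes) -> bytes:
    """XOR data with the key repeated end to end (encrypt and decrypt are the same)."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def recover_key(ciphertext: bytes, known_prefix: bytes, key_len: int = 4) -> bytes:
    """Recover the whole key from key_len bytes of known plaintext."""
    if len(known_prefix) < key_len or len(ciphertext) < key_len:
        raise ValueError("need at least key_len bytes of known plaintext")
    return bytes(c ^ p for c, p in zip(ciphertext[:key_len], known_prefix[:key_len]))


# Hypothetical values: neither the key nor the message format comes from a real product.
SECRET_KEY = struct.pack(">I", 0xDEADBEEF)
message = b"HELO agent=desktop-042 cmd=run"
wire = xor_repeating(message, SECRET_KEY)

# The attacker only assumes every message starts with "HELO".
guessed = recover_key(wire, b"HELO")
recovered = xor_repeating(wire, guessed)
print(recovered)
```

One captured packet and one guessable header are enough to read, and forge, everything else on the wire.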

Slide 17

I made $3Bn up, but the analysts make their numbers up too. I think I’m close, though.

This is not a list of the vendors we’ve tested. Ryan Russell at BigFix sat through our talk without throwing food at us and politely pointed out: we haven’t tested BigFix. Ryan reads our blog and is a smart guy. We apologize, Ryan.

Slide 18

We want you to remember these things about DSM products through the rest of the talk.

Slide 19

Perhaps a visual aid will help.

The nightmare scenario is this: that a smart attacker who might otherwise waste her time painstakingly breaking into tens or even hundreds of your servers, installing bot software and linking them up to some IRC server, instead reverses the protocols used by agent management systems. In return she gets a pre-installed botnet that nobody is looking at because it’s supposed to be there.

Could she actually do that?

Slide 20

Uh, yeah.

Slide 21

A picture.

The arrows are connections. Desktop agents connect to management servers. Management servers connect to desktop agents. None of the agent systems we looked at have the server connect to the admin’s web browser, but if you find one that does, I will not be shocked.

Theme of next batch of slides: each one of these arrows is an attack vector.

Slide 22

Round 1.

First, a word about disclosure. Matasano doesn’t publish vulnerabilities before patches are made available. We’re not claiming that this is the right way to do things; it may not be. It’s just simple and unambiguous.

We’ve found a lot of stuff and we want to tell you about it. But we haven’t released advisories for any of it yet. Those of you who think we should have waited for Black Hat 2015 to do this talk are excused from the rest of this presentation. We’re sorry.

The rest of you, read on. We’ve tried to make clear some of the places where we have imminent advisories at the top of the slides.

Anyways, Round 1.

Yes. There is a popular agent-based management product where unauthenticated clients can pretend to be servers and tell the agent to run a command.

Slide 23

A word about authentication. Many of these products offer a variety of different “levels” of authentication. Expect them to be turned off. Security complicates the lives of IT staff. I’m not being sarcastic; it just does. On most internal networks, HR is the first and last line of defense.

Slide 24

A word about “clientside” attacks.

Until recently, attacks that involved a server compromising a client (like a browser hole) were considered less valuable than direct attacks on the server. Because you have to convince a client to actually talk to your server.

Do not make that mistake in the DSM theater. If you take over an agent, you are basically guaranteed that a server is going to contact you. Both sides of these protocols, request and response, are always in play.

We want to coin a new term for this scenario. “Captive Clientside” is what you’ve got when the attack involves a malicious response to an innocuous request, and that request is guaranteed to arrive.

Slide 25

So for example: when the server asks the agent what version of an operating system it’s running, and the agent says it’s running a version of Windows XP that takes 10250 characters to accurately describe, and some of that string winds up in the program counter register in the server? That’s Captive Clientside.
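The flip side, sketched from the server's point of view: treat every agent reply as attacker-controlled and bound it before it goes anywhere. The 128-byte limit, the ASCII-only rule, and the function name are assumptions picked for illustration, not anything a real product does.

```python
MAX_OS_VERSION = 128  # assumed limit; pick whatever the real field can hold


def parse_os_version(raw: bytes, limit: int = MAX_OS_VERSION) -> str:
    """Validate an agent-supplied OS version string before the server uses it."""
    if len(raw) > limit:
        raise ValueError("OS version is %d bytes, limit is %d" % (len(raw), limit))
    # A non-ASCII reply raises UnicodeDecodeError, which is a ValueError subclass.
    text = raw.decode("ascii")
    if not text.isprintable():
        raise ValueError("OS version contains control characters")
    return text


print(parse_os_version(b"Windows XP SP2"))
```

A 10250-byte reply gets rejected here instead of landing in a fixed-size buffer somewhere deeper in the server.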

Slide 26

Let’s sidestep. What about the protocols themselves? How secure are they? They’re not actually sending cleartext passwords over the wire, are they?

Well, sometimes. But there are more fun examples. For instance, a challenge-response authentication protocol that produces a “logged in” cookie. And that cookie is 4 bytes long. And it starts at zero. And counts upwards from each login.

We found a remote heap overflow in one application, and the (triage, short term) fix was to “authenticate” the connection with a magic number. A 1 byte magic number. Want to guess which byte of the packet it was?
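Here is a toy model of the counting cookie, to show how little work the attack takes. `ToyServer` and `hijack` are stand-ins written for this note; the big-endian encoding and the lookback window are assumptions, and nothing here is taken from a real product.

```python
import struct


class ToyServer:
    """Stand-in for the flawed design: the session cookie is a 4-byte login counter."""

    def __init__(self):
        self._next = 0
        self._sessions = {}

    def login(self, user, password_ok):
        if not password_ok:
            return None
        cookie = struct.pack(">I", self._next)
        self._next = (self._next + 1) & 0xFFFFFFFF
        self._sessions[cookie] = user
        return cookie

    def whoami(self, cookie):
        return self._sessions.get(cookie)


def hijack(server, my_cookie, lookback=1000):
    """Walk backwards from our own cookie and collect every other live session."""
    mine = struct.unpack(">I", my_cookie)[0]
    found = {}
    for n in range(max(0, mine - lookback), mine):
        candidate = struct.pack(">I", n)
        user = server.whoami(candidate)
        if user is not None:
            found[user] = candidate
    return found


server = ToyServer()
server.login("admin", True)                  # the real admin logs in first
attacker_cookie = server.login("intern", True)
stolen = hijack(server, attacker_cookie)
print(stolen)
```

Any low-privilege login tells you exactly where the counter is, so every earlier session is one subtraction away.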

Slide 27

With enough modsecurity goop applied, I am actually comfortable running Wordpress on a Matasano server. Even if it is written in PHP.

If all some of these DSM UIs controlled was blog posts, and not agents on sensitive hosts, I still wouldn’t run them under any circumstances.

Make sure you lock down these web servers. Don’t let people connect to them at all if they aren’t in the right group. It’s not a great solution, but it’s a start.

Slide 28

Here’s an interesting attack.

The whole point of having some of these agents is so they can report events and status. The agents are the source of all the information in these systems.

A hypothetical example: what if you set your operating system version to “(script)alert(‘helu’);(/script)”?

That information has to go somewhere. In systems with web front ends, getting that data onto a web page is kind of the whole point of having the information.

And these are particularly valuable browser sessions to capture; they’re IT admins logged in to the management interface that controls thousands of agents.
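For the defenders, a minimal sketch of the fix: escape every agent-supplied field on its way into the page. `render_agent_row` and the table layout are hypothetical; `html.escape` is the Python standard library call.

```python
import html


def render_agent_row(hostname: str, os_version: str) -> str:
    """Build one table row for the admin console, escaping agent-supplied fields."""
    return "<tr><td>{}</td><td>{}</td></tr>".format(
        html.escape(hostname), html.escape(os_version)
    )


# What a hostile agent might report as its operating system version.
hostile = "<script>alert('helu');</script>"
row = render_agent_row("desktop-042", hostile)
print(row)
```

The hostile string still shows up in the console, but as text the admin can read rather than script the admin's browser runs.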

Slide 29

Many of these systems have GUI clients. When we see GUI clients, we rarely see sane protocols linking them to the server. If you were hoping to see plain ol’ HTTP under the hood, you’re going to be disappointed.

GUI client/server apps are the stomping ground for RPC protocols. And we saw all kinds. For reasons passing understanding, many of these developers saw the need to implement their own entire RPC stacks.

Keep telling yourself: it’s hard enough (read: damn near impossible) to get one protocol right. If it’s a complicated protocol, and you’re not Dan Bernstein, it can take years.

These systems have many, many protocols.

Slide 30

Like CORBA. CORBA is the future, if you live in 1996.

Slide 31

CORBA is object-oriented RPC. That sounds complicated, but we can make it very simple for you to understand. CORBA maps so neatly to web applications that it takes a smart person (someone, in fact, smarter than us) to explain why CORBA even needs to exist.

For example:

  • Web apps have the HTTP protocol.

    CORBA has IIOP. Think of IIOP as a (horrible) binary HTTP.

  • A web app might use a POST request.

    CORBA would call that a “message”.

  • A web app is delivered by a web server, like Apache or WebSphere.

    A CORBA app is delivered by an ORB, which is how CORBA people spell “web server”.

  • Web apps are broken down into “pages”. Web servers route requests to pages.

    CORBA apps are broken down into “objects”. ORBs route requests to objects.

  • Web apps have URLs.

    CORBA people spell U-R-L I-O-R.

  • Web apps use DNS.

    CORBA apps use something else. Let’s come back to that.

  • You pass arguments (like method names) to web apps in URL arguments.

    IIOP has special syntax for this.

  • Web apps use cookies to keep persistent state.

    IIOP spells “cookie” “SvcContext”.

  • Web apps pass data from the browser in POST data.

    CORBA passes it in MessageBodies.

Why did I tell you all of this? Because most of you know how to test web apps, a la OWASP. Well, now you know how to test CORBA apps, too. Most of the same attacks (in particular: forced browsing, weak cookies, and SQL injection) work the same way in CORBA.

Slide 32

OMG stands for “Object Management Group”, which is the IETF for CORBA.

This diagram is intended to show how CORBA naming works.

Alice wants to talk to Bob, who will only talk to people who have authenticated themselves. Alice needs to find both Bob and the authenticator.

  1. Alice is configured with an IOR that tells her where the CosNaming service runs. She asks CosNaming, “Where’s the authenticator?”

  2. Alice gets an IOR to the authenticator, and exchanges some messages to authenticate herself. Maybe she gets a SvcContext (read: cookie) back to prove that.

  3. Alice repeats the CosNaming process to find Bob.

  4. Bob uses CosNaming to find the authenticator, and talks to the authenticator to validate Alice’s cookie.

Slide 33

CosNaming is dynamic, unauthenticated, and exports methods to rebind names. Yeah, I’m not kidding. This actually works.

  1. Mallory rebinds the authenticator’s name to point to herself.

  2. Alice looks up the authenticator.

  3. Alice winds up talking to Mallory, thinking Mallory is the authenticator.

  4. Mallory proxies the authentication request to the authenticator, saving the password along the way.

  5. Alice talks to Bob, oblivious to what just happened.

  6. Bob does the same.

Of course, Mallory could have rebound any other name as well. Any two CORBA objects that communicate using CORBA names can be MITM’d remotely.
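The rebind attack fits in a few lines if you model it. This is a toy simulation, not real CORBA: `NamingService` borrows the `rebind` and `resolve` names but is just a dictionary, and the `Authenticator`, `Mallory`, password, and cookie format are all invented for illustration.

```python
class NamingService:
    """Toy stand-in for an unauthenticated naming service (not a real ORB)."""

    def __init__(self):
        self._bindings = {}

    def rebind(self, name, obj):
        # The flaw being modeled: nobody checks who is asking.
        self._bindings[name] = obj

    def resolve(self, name):
        return self._bindings[name]


class Authenticator:
    def __init__(self, passwords):
        self._passwords = passwords

    def login(self, user, password):
        if self._passwords.get(user) == password:
            return "cookie-for-" + user
        return None


class Mallory:
    """Proxies logins to the real authenticator and keeps a copy of each password."""

    def __init__(self, real_authenticator):
        self._real = real_authenticator
        self.stolen = []

    def login(self, user, password):
        self.stolen.append((user, password))
        return self._real.login(user, password)


naming = NamingService()
naming.rebind("Authenticator", Authenticator({"alice": "hunter2"}))

# Step 1: Mallory looks up the real object, then rebinds the name to herself.
mallory = Mallory(naming.resolve("Authenticator"))
naming.rebind("Authenticator", mallory)

# Steps 2 to 4: Alice resolves the name and logs in, none the wiser.
cookie = naming.resolve("Authenticator").login("alice", "hunter2")
print(cookie, mallory.stolen)
```

Alice still gets a valid cookie, so nothing looks broken from her side.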

Slide 34

If you can intercept the requests the GUI client is making, you can take shots at the code in the GUI. If you bust the GUI, you get the real management server, and thus all the agents.

Slide 35

Again, the data coming out of the agents is trusted. The whole system usually expects that data to be well-formed and predictable. The entire clientside attack surface of the GUI is in play here, because an attacker that compromises any 1 out of 10,000 agents can generate malformed data to exploit it.

Slide 36

Earlier on in the talk, we noted that where malware bots use IRC for Command and Control, agents often use sophisticated message-oriented middleware.

7 years ago, I did a startup with some friends built around multicast-style message-oriented middleware, for streaming media. I have a soft spot for this stuff. The ability to put a whole bunch of hosts in a group, and send them all a message all at once, pretty nifty.

But it’s also pretty complicated. Clients and servers have to register themselves. Maybe authenticate (probably not). Somewhere, the system has to keep track of who’s connected, where.

And remember, all of this stuff is designed to traverse your firewalls.

The code that implements these systems is linked into the agents. And it’s linked into the servers. And it runs on standalone messaging servers. And we found trivially exploitable vulnerabilities in all of it.

Slide 37

But let’s move away from stack overflows for a second. Here’s a general attack on a lot of these systems.

For some reason, these systems want to have “application layer” names; each agent gets its own nickname, which is usually just its hostname. Servers and agents find each other by nickname.

Well, this was a problem on IRC and it’s a problem here too. If you can crash any of the components and get the name unregistered, you can race them back on and hijack their names.
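A sketch of the race, with a first-come-first-served registry standing in for the messaging layer. The `Registry` class, the nickname, and the addresses are hypothetical; the "crash" is modeled as a plain unregister.

```python
class Registry:
    """Toy nickname registry: first to register a name owns it."""

    def __init__(self):
        self._names = {}

    def register(self, nickname, endpoint):
        if nickname in self._names:
            return False
        self._names[nickname] = endpoint
        return True

    def unregister(self, nickname):
        self._names.pop(nickname, None)

    def lookup(self, nickname):
        return self._names.get(nickname)


registry = Registry()
registry.register("payroll-db", "10.0.0.5:7000")               # the legitimate agent

blocked = registry.register("payroll-db", "10.6.6.6:7000")      # name is taken
registry.unregister("payroll-db")                               # attacker crashes the agent
won_race = registry.register("payroll-db", "10.6.6.6:7000")     # attacker gets back first
came_back = registry.register("payroll-db", "10.0.0.5:7000")    # restarted agent is locked out

print(blocked, won_race, came_back, registry.lookup("payroll-db"))
```

Once the name is hijacked, the server that looks up "payroll-db" talks to the attacker, and the real agent is the one that looks like an impostor.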

Slide 38

Does this slide look familiar? Like maybe a duplicate of Slide 26?

Oh yeah. Different protocol. New vulnerabilities.

Something I want to mention really quickly. When we write exploits for vulnerabilities, we want to implement as little of the application we’re testing as we can, because life’s short.

So, for instance, when we find a vulnerability in the protocol handler for the management GUI protocol in the server (say, it uses a trivially guessable “cookie” to prove that you’re authenticated) what we’ll do is, write a proxy for that protocol. It’ll pick out the login message, and substitute it with a series of malicious login attempts. When it succeeds, it’ll let the rest of the session proceed unmolested.

What we get from this approach is an exploit that takes the original Windows GUI client and gives it super powers, like the ability to log into the server without a password as any user.

Slide 39

And of course, all this stuff backs into an SQL database, like Oracle or Microsoft SQL Server. And you all know how secure those are in typical installations.

Remember now: these are databases where, if you can break them, you get control over every machine running an agent.

Slide 40

We found a system that has multiple Windows clients, running multiple protocols. Something to watch out for: special “one-time” setup clients, and “performance management” clients.

Another thing to watch out for: some of these systems install their own SNMP agents, in addition to everything else.

Slide 41

Our point with this talk is that we are worried that security teams don’t realize these applications exist. Not only do you need to know that they’re there, you need to treat them gingerly as the most sensitive applications on your network.

Do not bother locking down Windows Terminal Server if you’re not going to make sure your management agents are secure. Vulnerabilities in agents are way scarier than Remote Desktop vulnerabilities.

Slide 42

We didn’t have time to give these slides, and they presage a blog post I’m working on, so I’ll spin through them really quickly.

Slide 43

Ridiculous, annoying things we saw in protocols that sent binary requests over HTTP.

Of all these, the most irritating was an implementation of Base64 that used the standard character set, but in a different order. That was maddening; we knew our Base64 code worked just fine, so it took us forever to think to check that they might have gotten it wrong.
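If you hit one of these, you don't need to write a new codec; translate the odd alphabet into the standard one and hand it to the normal library. A minimal sketch: the `VENDOR` ordering below (lowercase before uppercase) is invented for illustration, since the real product's ordering isn't given here.

```python
import base64
import string

STANDARD = (string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/").encode()
# Hypothetical vendor ordering: same 64 characters, shuffled.
VENDOR = (string.ascii_lowercase + string.ascii_uppercase + string.digits + "+/").encode()

TO_STANDARD = bytes.maketrans(VENDOR, STANDARD)
TO_VENDOR = bytes.maketrans(STANDARD, VENDOR)


def vendor_b64decode(data: bytes) -> bytes:
    """Map the vendor alphabet onto the standard one, then decode normally."""
    return base64.b64decode(data.translate(TO_STANDARD))


def vendor_b64encode(data: bytes) -> bytes:
    """Encode normally, then map the output into the vendor alphabet."""
    return base64.b64encode(data).translate(TO_VENDOR)


encoded = vendor_b64encode(b"hello")
print(encoded, vendor_b64decode(encoded))
```

The padding character is not part of either alphabet, so it passes through the translation untouched.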

Slide 44

CORBA has per-message selectable endianness. The messages running in either direction can be big-endian or little-endian.
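What that means for a parser, as a hedged sketch: you can't pick a byte order once per connection; you have to read the flag in every header before you touch the length. The layout below (4-byte magic, two version bytes, a flags byte whose low bit means little-endian, a type byte, then a 4-byte size) is the 12-byte GIOP header as I understand it; the function name and dict keys are my own.

```python
import struct


def parse_giop_header(header: bytes) -> dict:
    """Parse a 12-byte GIOP header, honoring the per-message byte-order flag."""
    if len(header) < 12 or header[:4] != b"GIOP":
        raise ValueError("not a GIOP header")
    major, minor, flags, msg_type = struct.unpack("4B", header[4:8])
    little_endian = bool(flags & 1)
    (size,) = struct.unpack("<I" if little_endian else ">I", header[8:12])
    return {
        "version": (major, minor),
        "little_endian": little_endian,
        "type": msg_type,
        "size": size,
    }


# Two hand-built headers for the same 256-byte message, one per byte order.
big = b"GIOP" + bytes([1, 2, 0, 0]) + struct.pack(">I", 256)
little = b"GIOP" + bytes([1, 2, 1, 0]) + struct.pack("<I", 256)
print(parse_giop_header(big)["size"], parse_giop_header(little)["size"])
```

Ignore the flag and read the little-endian header as big-endian, and the same four bytes come out as 65536 instead of 256.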

Slide 45

Most TCP protocols have a “record layer”, where either side of the connection sends application-layer “packets”. These packets are usually encoded with some kind of length word tacked onto the front.

Except sometimes they don’t. In some protocols, the application is just writing a C struct directly to the wire.

ASN.1/BER has length-dependent lengths. And you can’t just always use 32-bit lengths; the protocol breaks if you don’t encode “1” in the fewest possible bits. We love dealing with stuff like that.
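For reference, here is a sketch of definite-form length encoding: short lengths fit in one byte, longer ones get a count byte followed by that many length bytes. As far as I know the "fewest possible bytes" rule is strictly a DER requirement, but plenty of implementations enforce it anyway, so the decoder below has a `strict` switch. Both function names are mine, and indefinite lengths are left out.

```python
def encode_length(n: int) -> bytes:
    """Encode a definite length in the fewest possible bytes."""
    if n < 0:
        raise ValueError("length must be non-negative")
    if n < 0x80:
        return bytes([n])                       # short form: one byte
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body     # long form: count byte, then length


def decode_length(buf: bytes, strict: bool = True):
    """Return (length, bytes_consumed); reject padded encodings when strict."""
    if not buf:
        raise ValueError("empty buffer")
    first = buf[0]
    if first < 0x80:
        return first, 1
    count = first & 0x7F
    if count == 0:
        raise ValueError("indefinite length is not handled in this sketch")
    body = buf[1:1 + count]
    if len(body) != count:
        raise ValueError("truncated length")
    n = int.from_bytes(body, "big")
    if strict and (n < 0x80 or body[0] == 0):
        raise ValueError("length is not minimally encoded")
    return n, 1 + count


print(encode_length(1), encode_length(128), encode_length(300))
```

So "always use a 32-bit length" means sending `84 00 00 00 01` for a length of 1, which a lenient peer accepts and a strict one throws away.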

CORBA has alignment-by-stream-offset. Wow. That’s just creatively bad. What it means is, you have to remember at all times how many bytes into the stream you are. If you want to write a 32 bit value, you have to do it at a byte offset that is a multiple of 4. What that really means is, you can’t ever copy a portion of a CORBA message, because the offsets will be wrong. So it’s impossible to write a CORBA proxy.
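A small sketch of why copying bytes around goes wrong. `CdrWriter` is a toy written for this note, not a real ORB's marshaller; `start_offset` stands in for wherever the value happens to sit in the stream.

```python
import struct


def pad_to(offset: int, alignment: int) -> int:
    """Number of padding bytes needed to reach the next multiple of alignment."""
    return (-offset) % alignment


class CdrWriter:
    """Toy marshaller that aligns values relative to their position in the stream."""

    def __init__(self, start_offset: int = 0):
        self.start = start_offset
        self.buf = bytearray()

    def _align(self, alignment: int) -> None:
        self.buf += b"\x00" * pad_to(self.start + len(self.buf), alignment)

    def write_octet(self, value: int) -> None:
        self.buf.append(value & 0xFF)

    def write_ulong(self, value: int, little_endian: bool = False) -> None:
        self._align(4)
        self.buf += struct.pack("<I" if little_endian else ">I", value)


# The same two fields, marshalled at stream offset 0 and at stream offset 1.
a = CdrWriter(start_offset=0)
a.write_octet(7)
a.write_ulong(1)

b = CdrWriter(start_offset=1)
b.write_octet(7)
b.write_ulong(1)

print(a.buf.hex(), b.buf.hex())
```

Same fields, different bytes: three padding bytes at offset 0, two at offset 1. Anything that moves a fragment to a new offset has to re-marshal it rather than copy it.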

Slide 46

We wrote a blog post about this. Go read it.

Slide 47

We wrote a blog post about this too.

Slide 48

And about shell-script fuzzing.

Slide 49

We rarely use sniffers like tcpdump or wireshark to work over protocols. What works much better is to configure the client or the server to talk directly to us, and then just proxy their traffic for them. We have a simple plugboard proxy that does this, logs traffic, and lets us inject stuff into the stream.
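This is not our proxy, just a minimal sketch of the idea: two threads pump bytes in each direction between a pair of connected sockets, log every chunk, and give you a hook to rewrite traffic in flight. The names `pump` and `plug`, the labels, and the rewrite callbacks are all assumptions made for illustration.

```python
import socket
import threading


def pump(src, dst, label, log, rewrite=None):
    """Copy bytes from src to dst until EOF, logging and optionally rewriting each chunk."""
    while True:
        try:
            chunk = src.recv(4096)
        except OSError:
            break
        if not chunk:
            break
        log.append((label, chunk))          # record what was actually sent
        if rewrite is not None:
            chunk = rewrite(chunk)          # injection point
        try:
            dst.sendall(chunk)
        except OSError:
            break
    try:
        dst.shutdown(socket.SHUT_WR)        # pass the EOF along
    except OSError:
        pass


def plug(client_sock, server_sock, log, rewrite_c2s=None, rewrite_s2c=None):
    """Wire two connected sockets together and return the two pump threads."""
    threads = [
        threading.Thread(target=pump, args=(client_sock, server_sock, "c2s", log, rewrite_c2s)),
        threading.Thread(target=pump, args=(server_sock, client_sock, "s2c", log, rewrite_s2c)),
    ]
    for thread in threads:
        thread.daemon = True
        thread.start()
    return threads
```

In practice you would accept the client's connection on a listening socket, open your own connection to the real server, and hand both to `plug`. Note the rewrite hook sees raw chunks, not whole protocol messages, so a real tool needs to reassemble records first.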

Slide 50

In conclusion: read our blog.

Who We Are

Matasano is a team of internationally respected security experts who have led security efforts at @stake, Microsoft, ISS, Secure Computing, Arbor Networks, Secure Networks, Bloomberg, Sandia Labs, and others. Read more about our team and how we can help you today.