Archive for August, 2007
Thomas Ptacek | August 27th, 2007 | Filed Under: Defenses, Uncategorized
Rich Mogull, reacting to our virtualization work:
[R]eading up on Nate and Tom’s work I can’t see any techniques for
detecting an unapproved hypervisor in an already virtualized
environment.
This is a misconception. Defenders will not have a hard time detecting
unauthorized hypervisors, even when the defenders are already running
VMware or Microsoft Virtual Server.
Here’s why: the defender is embedded in VMware or Microsoft Virtual
Server.
Sound crazy? It isn’t. To detect kernel malware, you (typically)
already need to be running in-kernel; in other words, you have to be
part of the operating system. For the most part, to detect
virtualization, you have to be in-kernel as well.
Both Blue Pill and Samsara need access to the hardware. The trick is,
Samsara works even when Blue Pill is actively trying to “cheat” it,
making it believe it’s talking to the hardware when it’s talking to a
Blue Pill facade instead.
Joanna contests this argument. “Microsoft and VMware would never embed
detection hacks into their hypervisors!” We agree. Microsoft and
VMware are unlikely to ever need to. Hypervisor rootkits are not a
major threat. But if they ever become one, hypervisor rootkit authors
will find themselves sitting ducks for detectors. Joanna had the
right idea, “chickening out” into the OS kernel to hide from
Samsara.
Thomas Ptacek | August 27th, 2007 | Filed Under: Gatherings, Uncategorized
Come see Eric Monti reprise our Black Hat talk on Extrusion Detection
and Content Management and Filtering systems. Next Wednesday,
September 5th, at Chicago OWASP. From the abstract:
Some “Extrusion Detection” products rely on network gateway IPS/IDS
approaches, whereas others work in a way more closely resembling
host-based IDS/IPS. The main difference is that instead of
detecting/preventing malicious information from entering a company’s
perimeter, they focus on keeping assets inside.
We’ve been evaluating a number of products in this space and have
run across a large number of vulnerabilities. They range from
improper evidence handling, to inherent design issues, all the way
to complete compromise of an enterprise, using the Extrusion
Detection framework itself as the vehicle.
Capsule summary: Eric and I got a chance to test several
market-leading “Extrusion Detectors”. None of them emerged
unscathed. Eric will talk about the techniques and methods we used to
pick these black-box systems apart, and what types of vulnerabilities
we found.
Chicago OWASP is open to all comers, but you do need to RSVP to Jason
Witty (jason at wittys dot com) sometime before next Tuesday. Meetings
are held in the LaSalle Bank building on Madison. Check the OWASP
page for more details. See you there!
Thomas Ptacek | August 27th, 2007 | Filed Under: Gatherings, Uncategorized
Nate Lawson reports on BaySec, “I counted nearly 45 people at one
point, a new record. I think we might be even bigger than Chicago.”
Well, let me tell you something, Nate. BaySec will never be bigger
than Chisec. We will count people twice. We will count dead people. We
will pay people to show up. We will hurt people who don’t show
up. We’ll burn their houses to the ground. Their families. DEAD. You
want the truth? You can’t handle the truth! Because when you look
over at a pile of goo that used to be your best friend’s face, you’ll
know what I’m saying! Forget it, Nate! It’s Chinatown!
Oh, also, Chisec is this week on Thursday. Same time, same place. No
RSVP required.
Thomas Ptacek | August 25th, 2007 | Filed Under: Defenses, Development, Uncategorized
Daniel J. Bernstein visualizes ciphers by rendering their DAGs: an
intermediate representation, as would be used by a compiler as a step
towards generating object code.
(If you’re not familiar with the concept, a DAG is just a tree where a
node can have more than one parent. Single inheritance: tree. Multiple
inheritance: DAG. If you are familiar with the concept, I apologize
for saying “a DAG is just a tree…”).
How cool is this? For starters: I must have this poster. Here’s a
snippet of MD5:

And of SHA-256:

More substantively, from the (short) paper:
The right half of the SHA-1 graph is the SHA-1 message expansion,
and the right half of the SHA-256 graph is the more complex SHA-256
message expansion; for comparison, the MD5 graph has many long edges,
allowing an attacker to effortlessly pierce deep into the heart of the
MD5 computation.
Using visualization tools to find vulnerabilities has been in vogue
among the RCE crowd for years now. It’s the whole idea behind Halvar’s
excellent BinNavi tool. Now here’s an example of how cryptographers
have been using the same idea.
[…] I certainly can’t claim that the tools have saved time in
cryptanalysis. But I think that the tools will save time in
cryptanalysis, automating several tedious tasks that today are
normally done by hand.
Worth watching. Funny quote:
[M]y initial experiments with bit DAGs for MD5 have crashed every
standard drawing tool that I’ve tried. My own drawing tools are much
more careful in their use of memory.
Ok, maybe it’s only funny if you’ve tried to do large layouts in
Graphviz before, or wrote your own crappy graph layout code.
Thomas Ptacek | August 22nd, 2007 | Filed Under: Defenses, Uncategorized
Common problem in software protection:
You have a subprogram that checks to make sure no debugger
is attached to your code.
Attackers patch the subprogram, or its callers, to avoid
the check.
You verify the subprogram and its callsites with a hash.
Attackers patch the verification code, or its callers,
to avoid the check.
Repeat ad infinitum.
Nate Lawson has two cool “mesh design” answers for this problem.
The first is “hash and decrypt”. A “check for a condition” is a
boolean function. Patching boolean functions is trivial (usually just
a one-byte change). Instead of emitting a boolean, make your “check”
compute a condition-dependent value, and use it as a key. A sketch:
Instead of a true-false function, write a function that
computes a value that depends on the condition. For instance,
an antidebugging function might use log-scaled instruction
cycle timings.
Collect these measurements and hash them.
Use the hash as an AES key.
Use the key to decrypt and execute the “success” path
through your program. If the “check” for the condition
failed, the key will be wrong, and the success path won’t
execute.
What’s cool about this is that hashes are an inherently strong way of
aggregating lots of different measurements (the integrity of program
text, stored state, and the results of explicit measurements), and the
unlock-and-execute pattern means no simple patch can detour the
program’s self-checks.
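Here is a minimal Python sketch of the hash-and-decrypt shape. Everything in it is an assumption made for illustration: the function names, the sample timings, and above all the cipher. Python’s standard library has no AES, so a toy XOR keystream built from SHA-256 stands in for it. It is not secure; a real implementation would use a real cipher.

```python
import hashlib
import math

def bucket(cycles):
    """Log-scale a timing so ordinary jitter lands in the same bucket."""
    return int(math.log2(cycles))

def derive_key(measurements):
    """Hash the bucketed measurements into a 32-byte key."""
    h = hashlib.sha256()
    for m in measurements:
        h.update(bucket(m).to_bytes(2, "big"))
    return h.digest()

def xor_stream(key, data):
    """Toy stand-in for AES: XOR data with a SHA-256-derived keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(4, "big")).digest())
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))

# At build time: seal the "success" path under the key a clean run produces.
clean_timings = [100, 110, 105]          # hypothetical cycle counts
payload = b"success path"
sealed = xor_stream(derive_key(clean_timings), payload)

# At run time: a slightly noisy clean run derives the same key...
ok = xor_stream(derive_key([102, 108, 99]), sealed)
# ...but a run with one wildly slow measurement (say, a single-stepped
# instruction) derives a different key and decrypts to garbage.
bad = xor_stream(derive_key([90000, 110, 105]), sealed)
```

There is no boolean to patch here. The only way to get the success path is to reproduce the measurements.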
Nate’s next idea, just posted in his blog, thus invalidating his
future patent claims, is to use error correcting codes to do the same
thing.
Instead of verifying program text by hashing and generating a key,
Nate proposes to use Turbo codes to generate parity blocks for
security-sensitive functions. Instead of simply calling these
functions in the text, programs copy them, apply the parity to correct
any “errors” (read: unauthorized patches), and execute.
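To show the copy-correct-execute shape, here is a Python sketch that swaps in a much weaker code: a three-way repetition code with a byte-wise majority vote. This is not Turbo coding and not Nate’s implementation. The function names are made up, and the “parity” here is just two redundant copies stored elsewhere.

```python
def make_parity(code):
    """Return two redundant copies; a crude stand-in for real parity blocks."""
    return bytes(code), bytes(code)

def repair(text, parity_a, parity_b):
    """Majority-vote each byte across the program text and the two copies."""
    fixed = bytearray()
    for t, a, b in zip(text, parity_a, parity_b):
        # Keep the text byte if either copy agrees with it; otherwise
        # fall back to the first redundant copy.
        fixed.append(t if t == a or t == b else a)
    return bytes(fixed)

original = bytes([0xB8, 0xFF, 0x00, 0x00, 0x00, 0xC3])   # mov eax, 0xff; ret
parity_a, parity_b = make_parity(original)

patched = bytes([0xB8, 0x01, 0x00, 0x00, 0x00, 0xC3])    # attacker's one-byte patch
restored = repair(patched, parity_a, parity_b)
```

The patched byte is outvoted, so the copy that would get executed matches the original.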
Clever! Read the post. As always, Nate’s posts are required reading.
Thomas Ptacek | August 22nd, 2007 | Filed Under: Disclosure, New Findings
RSnake discovers that Google gadgets can be coerced into
rendering arbitrary Javascript tags, and reports it to Google.
Google responds, in effect, “that’s one of the reasons why they
live under gmodules.com”, continuing, “If you do find a way of
executing this code from the context of a google.com domain,
though, please let us know.”
RSnake decries a “slap in the face”, saying “Google needs to figure
out what XSS is used for”. Attackers can set up phishing
sites on “Google-branded domains”.
Um, RSnake.
Phishers can create “Google branding” on any site they
want. That “Google branding” is just a GIF. Pretty sure
Javascript anywhere can put it on the screen.
That same “vulnerability” appears to be present on Blogspot.
Presumably, on Typepad, Livejournal, and Wordpress as well.
Last time I saw a Bank of America phish, the logo wasn’t
drawn in Windows Paint. I didn’t blame Bank of America.
Likewise, I’m not sure it’s hard for phishers to find or
create infected domain names that start with the letter ‘g’.
What, exactly, do you want them to do? I like your stuff and all that,
and if it’ll make you happy, I’ll bribe Dave to give you a Pwnie for
“best Google hack” over this. But am I wrong about this? Or are you?
Thomas Ptacek | August 22nd, 2007 | Filed Under: Bitching About Protocols
Larry Seltzer, writing in his column:
Of course, as Aitchison says, even that third party is accessed
through domain names. If I’m spoofing DNS, what’s to stop me from
spoofing verisign.com and putting up my own fake certificate
authority?
Of course, what stops you from erecting your own CA is that you can’t
recover the prime factors of very large numbers (or pretend to), or if you can, you
have better things to do with your time than set up phishing
sites. Like winning a Fields medal.
In other words: your browser shipped with Verisign’s key. You never
trusted the (insecure) DNS to get it. Neither did you trust:
the (insecure) DHCP protocol, which told you your router’s
address, or
the (insecure) ARP protocol, which told you how to reach
that address, or
the (insecure) OSPF routing protocol, which told your ISP’s
network where to send the packets you handed your router, or
the (insecure) BGP interdomain routing protocol, which, but
for a carefully-crafted regex string, allowed any network in the
world to claim the destination address of those packets.
Because if any of these failed, you’d be in the same pickle, DNSSEC or
not.
… it’s like, I don’t even care if the DNS is secure.
Thomas Ptacek | August 21st, 2007 | Filed Under: Bitching About Protocols
.
I’m Ron Aitchison, and you’re wrong about DNSSEC.
Thanks for sharing. But what makes you so sure?
.
Because I’m “the President of Zytrax”.
Do they have SSL in Zytrax?
.
You “have to get to the right place, the right IP address” for SSL to
work; insecure DNS undercuts SSL.
No, it doesn’t. SSL doesn’t depend on DNS for security. That would
have been lunacy: DNS is insecure. So instead, your browser shipped
with Verisign’s key. You can’t spoof a certificate because you
can’t break RSA.
.
But I can make my own certificates “in the real name of
respectedfinancialinstitution.com but sign it myself using a plausible
looking name”, and you’ll just ignore the error. After all, Google’s
SSL certs never work.
DNSSEC has exactly the same problem. Both DNSSEC and SSL provide the
same signal: “the authenticity of this server cannot be
verified”. Just because the signal comes from DNS, doesn’t make
it any easier for Firefox to render it.
Oh, what’s that you say? DNSSEC solves the problem? Oh, that’s right:
when DNSSEC signatures don’t validate, domain names don’t resolve! I
get it. The major advance that DNSSEC provides is removing the
“continue” button from the SSL certificate warning dialog in my
browser.
A modest proposal: can we just disable that button, and forget about
DNSSEC? I don’t know who’s going to explain it to the users,
though. One-two-three not it!
Oh, sorry. You meant, “I can get a legitimate, Verisign-signed
certificate with an authentic-sounding name, and fool users with the
name alone”. Allow me to retort: “goto DNSSEC has exactly the same
problem”.
.
It is “stunningly naive” to assume sites without SSL are
unimportant. I can plant stories in the New York Times! With “grainy,
on-the-spot, sense-of-realism” footage, and cause mass panic!
Wow. That is a cool story. Let me see if I can outdo you. I’m
cheating, though: compared to yours, my story is plausible.
First, construct a time machine, and send the Internet back 93 years.
A young Gavrilo Princip, enraged at Austro-Hungarian hegemony over
Serbia, is just about to let off some steam with a scathing DailyKos
post. But just as he pushes the “Submit” button, a DNSSEC RR signature
expires, and Gavrilo gets a “host not found” error. Enraged, he storms
from his house to a deli, where he comes upon Archduke Franz
Ferdinand and puts a bullet through his jugular, goading Austria to
issue a series of untenable demands to Serbia. Entangling alliances
and the DNSSEC-prompted delivery failures of conciliatory diplomatic
emails drag a whole continent into a bloody, pointless war of
attrition.
So you see, for me, no matter what the cost, avoiding DNSSEC is vital
to the future integrity of the Internet.
.
Yes, “One of the underlying principles of security is that more code =
more errors and security holes.” But no it isn’t. “Bugs are removed
and the world moves forward.”
It’s either “one of the underlying principles”, or it isn’t. Google:
“hand waving”.
.
In fact, DNSSEC will “be relatively more bug-free” than SSL,
because DNSSEC servers can use OpenSSL’s libraries.
Now Google “asymptote”.
.
DNSSEC servers “would do a trivial amount more work”, and are only
“marginally more vulnerable to DoS attacks”.
Now Google “chargen Case Against DNSSEC”. Then hit control-F, search
for “let’s not even start talking about the guys who run COM”. I
didn’t make that up. I also didn’t make up the proposal to turn DNS
zones into Unix password files, or the proposal to construct fake
covering RR sets to foil zone transfers.
Shenanigans. Read the backlogs of
namedroppers, the deployment list, or the dnsops list. Performance is an issue.
.
When DNSSEC is “end-to-end”, which it must be, all the heavy lifting
will be done by end-user machines.
I propose a new drinking game: “imagine that you could wave a magic
wand and update the bare metal software of every mainstream
machine on the Internet, and then come up with something more valuable
to do with that power than DNSSEC. Whoever comes closest to the floor
value of DNSSEC, without going through it, wins.”
My entry: every host on the Internet gets a royalty-free copy of
Univers, in all its normal and oblique widths. Also we
eliminate Comic Sans. In fact, just get rid of Comic Sans. Still more valuable than
DNSSEC!
.
Yes, to make end-to-end DNSSEC work, “the current stub-resolvers
installed on most of the worlds computers would need to be
replaced”.
Why, that’s no harder than spontaneously converting the whole world to
IPv6!
.
“By the way, Verner was convinced”.
They have medication for that.
[PS]
When you deride the Google ops teams by saying they can’t keep their SSL certificates up-to-date, I’ll respectfully oblige you to cite sources, or again call “shenanigans!” on you.
Thomas Ptacek | August 20th, 2007 | Filed Under: Uncategorized
1.
A refresher.
Here’s an IP address:

Here’s the same IP address, in integer form, as it would appear in
an IP packet:
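To make the conversion concrete, here is a small Python sketch. The address 192.0.2.1 is an assumption picked from the documentation range, not necessarily the one this post originally showed, and the helper names are made up.

```python
import ipaddress
import struct

def dotted_to_int(dotted):
    """Convert a dotted-quad string to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(dotted))

def int_to_wire(value):
    """Pack the integer as 4 bytes in network (big-endian) byte order."""
    return struct.pack("!I", value)

addr = dotted_to_int("192.0.2.1")
print(addr, hex(addr), int_to_wire(addr).hex())
# prints: 3221225985 0xc0000201 c0000201
```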

Got that? Then you’ll have no problem with protected memory, which is
how your machine can simultaneously run a web server and a web
browser.
Here’s an address in your browser’s memory, in integer form:

Your webserver on the same machine has data at the same address (it
points towards the bottom of the stack). How’s that work?
Well, here’s the same address, with the fields broken out:
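In code, breaking out the fields is a couple of shifts and masks. This sketch assumes classic 32-bit x86 paging with 4 KB pages (10 bits of page directory index, 10 bits of page table index, 12 bits of offset). The address 0xBFFFF7A0 is a hypothetical stack address, not necessarily the one pictured.

```python
def split_vaddr(vaddr):
    """Split a 32-bit virtual address into (directory index, table index, offset)."""
    pd_index = (vaddr >> 22) & 0x3FF   # top 10 bits
    pt_index = (vaddr >> 12) & 0x3FF   # middle 10 bits
    offset = vaddr & 0xFFF             # low 12 bits
    return pd_index, pt_index, offset

print(split_vaddr(0xBFFFF7A0))
# prints: (767, 1023, 1952)
```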

Like the IP address, these representations mean the same thing. A
virtual address has 3 parts: a page directory index, a page table
index, and a page offset. The first two parts form a “virtual page
number (VPN)”. When a program in protected mode tries to access data in
memory, the memory management hardware uses the VPN to find the
unshared real honest physical memory to read from:

You’ll see variants of this diagram all over the place. Here’s how to
read mine, which is simplified:
1. Start with CR3, which is a register: a value stored
   directly in the CPU. CR3 stores the Page Directory Base
   Address. Think of the Page Directory Base as the hardware’s
   equivalent of your process’ PID.
2. Take the Page Directory Index from the virtual address you’re
   trying to read from. Offset that many entries into the Page
   Directory from CR3. That’s the Page Table for the address.
3. Take the Page Table Index from the address. Offset that many
   entries from the Page Table you got from step 2. There’s the
   physical memory page the address corresponds to.
4. Finally, take the Page Offset from the address, and step that
   many bytes into the page you found from step 3.
Two different processes have two different CR3 values, and so have two
different page table hierarchies. So the same virtual address in your
web browser and web server points to two different values.
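The walk is short enough to model in a few lines of Python. This is a toy: physical memory is a dict of tables keyed by their base address, and every address and table layout below is invented for the example.

```python
def translate(cr3, vaddr, tables):
    """Walk a two-level page table hierarchy, starting from CR3."""
    pd_index = (vaddr >> 22) & 0x3FF
    pt_index = (vaddr >> 12) & 0x3FF
    offset = vaddr & 0xFFF
    page_table_base = tables[cr3][pd_index]         # directory entry -> page table
    page_base = tables[page_table_base][pt_index]   # table entry -> physical page
    return page_base + offset

# Hypothetical "physical memory": table base address -> {index: address}.
tables = {
    0x1000: {767: 0x2000},      # browser's page directory
    0x2000: {1023: 0x50000},    # browser's page table
    0x3000: {767: 0x4000},      # web server's page directory
    0x4000: {1023: 0x90000},    # web server's page table
}

BROWSER_CR3, SERVER_CR3 = 0x1000, 0x3000
vaddr = 0xBFFFF7A0

print(hex(translate(BROWSER_CR3, vaddr, tables)))   # prints: 0x507a0
print(hex(translate(SERVER_CR3, vaddr, tables)))    # prints: 0x907a0
```

Same virtual address, different CR3, different physical memory.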

By the way, if the Page Directory step seems complicated, consider: most
processes use a tiny fraction of the entire 4 gigabyte virtual address
space. Each page table describes 4 megs of that space. Without the
page directory, you’d need 1024 page tables, each with 1024
entries. With it, you only need page tables for address space you use,
plus 1 more for the page directory.
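The arithmetic, spelled out. This assumes 32-bit non-PAE paging, where each entry is 4 bytes. The three-table process is a made-up figure for a small program.

```python
PAGE_SIZE = 4096           # bytes per page
ENTRIES_PER_TABLE = 1024   # entries in a page table or page directory
ENTRY_SIZE = 4             # bytes per entry

# Address space described by one page table: 1024 pages of 4 KB each.
span_per_table = ENTRIES_PER_TABLE * PAGE_SIZE    # 4 MB

# Size of one table (or of the directory): exactly one page.
table_bytes = ENTRIES_PER_TABLE * ENTRY_SIZE      # 4096

# Without a directory: all 1024 page tables, used or not.
flat_bytes = ENTRIES_PER_TABLE * table_bytes      # 4 MB per process

def paging_overhead(tables_in_use):
    """Bytes of paging structures: the page tables in use plus one directory."""
    return (tables_in_use + 1) * table_bytes

print(span_per_table, flat_bytes, paging_overhead(3))
# prints: 4194304 4194304 16384
```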
2.
All good, right?
Not kablamo!
See, there’s something called a memory hierarchy:

Your goal as a modern computer system is to stay as close to Oscar the
Register as possible. Your goal as a modern computer system is to stay
the hell away from Ernie the DRAM cell, as much as possible. Ernie is
slow. That’s what Cache Monster is for.
But the page tables are stored in DRAM. Uncached, page translations
double the number of times you hit DRAM. Ow. So of course address
translation is cached. The cache is called the TLB:

Address space use in normal programs is very predictable, with lots of
locality. Most address translations run out of the TLB. Your CPU’s TLB
design is important.
3.
Let’s see how to exploit the TLB cache to detect virtualization.
Consider your CPU in steady state. The TLB cache mirrors the page
tables (there are more page table entries than TLB entries, which are
reclaimed as needed; obviously, there are more physical pages than
PTEs).

Now, saturate the TLB. Allocate a big block of memory, which has the
side effect of filling in a bunch of PTEs. Allocate another page of
memory. Color the page with one value, and the big block of pages with
another. As above, the TLB and the page tables will reflect each
other, and the TLB has a fixed size, so if you grab enough memory, all
the TLB entries will point to that block.

Here’s the fun part: desync the TLB from the page hierarchy.
There’s no magic that synchronizes the X86 TLB with the page
tables. If a TLB entry and a PTE are in sync, and you modify the
PTE without updating the TLB, memory accesses will reflect the “stale”
TLB value, not the “current” PTE value.
When you desync the TLB and the page tables, there are a few things you
can do to sync them back up. You can write to CR3 (as if switching to
another process), which flushes the TLB. Or you can issue an “invlpg”
instruction to clear an individual page.
Or you can do neither, and instead deliberately wire all the PTEs for
the big block of memory you allocated to the dummy page:

Turn off interrupts and preemption (and in all other respects halt the
running OS kernel) and then do this.
Your CPU is now in an interesting state.
As long as you don’t. touch. anything. else, memory accesses to
the big block of memory will behave like they did before you desynced
the TLB (you’ll read values out of the big block of memory).
But breathe on the memory hierarchy the wrong way right now, and that
will change. An entry will get evicted from the TLB, and the next
access to the address that was cached in that TLB entry will get
translated out the PTE, which now points somewhere else.
So, the trick is simple:
Saturate the TLB.
Desync it.
“Do something”
See if a TLB entry was lost by reading from the block of memory,
from each page, and seeing if you get the PTE’s version or the
TLB’s.
For example:

A no-op instruction won’t change anything. Neither will zeroing a
register. But access a random address outside the big block, and that
address will need to get translated; it will miss the TLB, and the
resulting translation from the page table will get cached, evicting
one of our “big block” entries.
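Here is a toy model of the whole trick, for poking at without a kernel module. Everything in it is an assumption for illustration: the ToyMMU class, the 8-entry TLB, and strict LRU replacement are invented, and real TLBs are more complicated. In this model the probe walks the block in reverse, because with LRU a forward walk would turn one miss into a cascade of evictions.

```python
from collections import OrderedDict

class ToyMMU:
    def __init__(self, tlb_entries):
        self.tlb_entries = tlb_entries
        self.tlb = OrderedDict()   # vpn -> frame, least recently used first
        self.page_table = {}       # vpn -> frame
        self.frames = {}           # frame -> colour byte

    def map(self, vpn, frame, colour):
        self.page_table[vpn] = frame
        self.frames[frame] = colour

    def read(self, vpn):
        """Read a page's colour, translating through the TLB first."""
        if vpn in self.tlb:
            self.tlb.move_to_end(vpn)                 # hit: refresh LRU position
        else:
            self.tlb[vpn] = self.page_table[vpn]      # miss: walk and cache
            if len(self.tlb) > self.tlb_entries:
                self.tlb.popitem(last=False)          # evict least recently used
        return self.frames[self.tlb[vpn]]

def count_evictions(do_something, tlb_entries=8):
    mmu = ToyMMU(tlb_entries)
    dummy_frame = 200
    mmu.map(1000, dummy_frame, 0x08)     # the dummy page
    mmu.map(2000, 300, 0x00)             # an unrelated page outside the block
    block = list(range(tlb_entries))
    for vpn in block:
        mmu.map(vpn, 100 + vpn, 0xFF)    # the big block, coloured 0xFF
    for vpn in block:
        mmu.read(vpn)                    # saturate the TLB
    for vpn in block:
        mmu.page_table[vpn] = dummy_frame   # desync: rewire PTEs, leave TLB stale
    do_something(mmu)
    # Probe in reverse; any page that no longer reads 0xFF lost its TLB entry.
    return sum(1 for vpn in reversed(block) if mmu.read(vpn) != 0xFF)

print(count_evictions(lambda mmu: None))             # no-op: prints 0
print(count_evictions(lambda mmu: mmu.read(2000)))   # outside access: prints 1
```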
Now the bit about virtualization.
The CPUID instruction retrieves info about the CPU directly from the
chip. Issuing a CPUID instruction shouldn’t touch memory.
But on Intel chips, if you’re in a VT-x guest virtual machine, CPUID
causes a “VM exit”, a trap to the hypervisor. The hypervisor has to
emulate the CPUID instruction to the guest machine on behalf of the
actual hardware. The hypervisor is just code, probably written in C,
just like the kernel. And at a minimum, it has to touch memory to
figure out what kind of trap it is handling.
In the original Intel VT-x implementation, a VM exit flushed the whole
TLB, just like a CR3 write does in a process switch. So that’s pretty
noticeable. On AMD SVM, where Blue Pill runs, there are ASIDs that tag
the TLB, so not every entry is flushed:

But the hypervisor still has to touch memory to figure out what kind
of trap it’s handling, which evicts a TLB entry. When control is
handed back to the VM, you’ll see this as an offset into the big block
of memory that reads the wrong value.
4.
A neat twist:
There’s a separate TLB for instructions (the ITLB) and for data
(the DTLB). Instruction execution implies virtual memory reads, to fetch
the instructions.
Here’s a very, very short subroutine:
{ 0xb8, 0xff, 0x00, 0x00, 0x00, 0xc3 };
This is:
mov eax, 0xff
ret
Or, somewhat equivalently:
int return_FFh(void) { return(0xff); }
Change the 0xff byte to 1, and you’ve got “return_01h()”. And so on.
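Generating these stubs is mechanical: “mov eax, imm32” encodes as 0xB8 followed by the 32-bit immediate in little-endian order, and “ret” is 0xC3. A Python sketch (make_stub is a made-up name):

```python
def make_stub(value):
    """Machine code for: mov eax, <value>; ret (32-bit x86)."""
    return bytes([0xB8]) + value.to_bytes(4, "little") + bytes([0xC3])

print(make_stub(0xFF).hex())   # prints: b8ff000000c3
print(make_stub(0x01).hex())   # prints: b801000000c3
```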
So repeat the TLB desync trick, but instead of probing the cache with
reads, probe it with subroutine calls. Your big block of memory is
effectively filled with “return_FFh()”; your test page is effectively
“return_08h()”.
Now, when you do something that causes a VM exit, the hypervisor’s own
code execution evicts ITLB entries. Before the VM exit, every call
into the big block returns 0xFF. After it, one or more of them will
return 8 instead.
5.
Another neat twist:
As you write to DRAM, you fill lines in the data cache. A write to
memory doesn’t instantly update DRAM. Dirty locations modified by code
are written back to memory as cache lines are evicted.
But the X86 has an instruction, “invd”, which clears out the data
caches without necessarily flushing the cached values back to main
memory.
This suggests another variation of the “saturate-and-probe” trick:
Allocate a big block of memory. Color it 0xFF.
Saturate the cache, queuing up enough writes to fill it.
“Do something”.
Issue “invd” to clear out the cache.
Read the block of memory and see if any of your queued-up
writes “leaked” to main memory.
Again, hypervisor memory accesses will evict cache entries; the side
effect here is that a memory write we expected to throw away will get
burned into memory.
Two caveats here:
I haven’t tested this. I don’t think Keith Adams has tested
it. The TLB desync trick turned out trickier than we expected
it to be when we wrote it (for instance, you have to iterate
over the pages backwards).
According to People Who Would Know, Intel hardware doesn’t
promise to honor “invd”: writes could have leaked to main memory
even if you told the CPU to chuck them.
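The idea is at least easy to model. This Python sketch is a toy write-back cache with LRU eviction and an invd() that discards everything without writing back. The class, the 8-line cache size, and the replacement policy are all assumptions, and the model says nothing about whether real hardware cooperates.

```python
from collections import OrderedDict

class ToyWriteBackCache:
    def __init__(self, lines):
        self.lines = lines
        self.cache = OrderedDict()   # addr -> (value, dirty), LRU first
        self.dram = {}               # addr -> value

    def _fill(self, addr, value, dirty):
        self.cache[addr] = (value, dirty)
        self.cache.move_to_end(addr)
        if len(self.cache) > self.lines:
            old_addr, (old_value, old_dirty) = self.cache.popitem(last=False)
            if old_dirty:
                self.dram[old_addr] = old_value   # eviction writes back to DRAM

    def write(self, addr, value):
        self._fill(addr, value, True)

    def read(self, addr):
        if addr in self.cache:
            self.cache.move_to_end(addr)
            return self.cache[addr][0]
        value = self.dram.get(addr, 0)
        self._fill(addr, value, False)
        return value

    def invd(self):
        self.cache.clear()           # throw away cached lines, dirty or not

def leaked_writes(do_something, lines=8):
    cache = ToyWriteBackCache(lines)
    block = list(range(lines))
    for addr in block:
        cache.dram[addr] = 0xFF      # colour the block directly in "DRAM"
    for addr in block:
        cache.write(addr, 0x00)      # queue up writes; they sit dirty in cache
    do_something(cache)
    cache.invd()
    # Any location that no longer holds 0xFF had its write leak to DRAM.
    return sum(1 for addr in block if cache.dram[addr] != 0xFF)

print(leaked_writes(lambda c: None))           # no-op: prints 0
print(leaked_writes(lambda c: c.read(5000)))   # outside access: prints 1
```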
6.
Of course, you don’t need to be this tricky with the caches to detect
the footprints that a hypervisor leaves in them. Evicting entries
from a cache will influence the timings of subsequent instructions.
So even if “invd” doesn’t let you get your CPU to a state where
“cpuid” changes the result of a memory read, you can still monitor
cache timing, using any of the local timers, to detect unexpected
cache evictions. Cryptanalysts have done it to steal RSA and AES keys,
and they don’t even have the OS cooperating with them.
7.
A word about credit for these ideas: none of it should go to me. At
the same time that Peter Ferrie, a member of our team, published the
first paper mentioning the TLB attack, a team led by Tal Garfinkel
was working on a paper independently documenting the same attack.
Tal’s research partner Keith Adams wrote a blog post, which
was the first public mention of the TLB desync technique, and almost
certainly the origin of the “invd” idea.
And apparently some guy at McAfee has had the TLB idea for over a
year, although it looks like he got so excited about it during
Joanna’s talk that he wrote it down on a napkin for the first
time. (Note to McAfee guy: your notes from the show are cool and all,
but I think the slides from our talk are easier to read)
Thomas Ptacek | August 15th, 2007 | Filed Under: Uncategorized
BaySec is the fastest growing CitySec meetup we know of. Unsurprising, given how many cool security people work in the Bay Area. If you're one of the cool kids, or want to meet them, or make fun of them, all you have to do is show up to BaySec next Monday. It's at O'Neil's Irish Pub. Nobody RSVPs, nobody sponsors, nobody gives a vendor spiel, but the Mozilla people might have bright orange "Scriptstrong" gel wrist bracelets if you ask. Tell them we sent you.