The X86 Memory System And Why It’s Hard To Virtualize Securely

Thomas Ptacek | September 28th, 2007 | Filed Under: Defenses, Uncategorized

The X86 memory system causes surprising confusion. By way of analogy: compare iTunes on Windows to machines talking to the Internet from a corporate network. Here’s how memory breaks down in the analogy:

  • User virtual addresses: Pretend iTunes is a VMware guest image, running in bridged networking mode, assigned 10.0.1.8 (a private IP).

  • Kernel virtual addresses: Windows iTunes runs under the Windows NTOSKRNL kernel; in our analogy, this is VMware’s host operating system. Like the bridged VMware guest, it’s on the internal net, with 10.0.1.4.

  • Physical addresses: NTOSKRNL works with the CPU’s MMU to map virtuals into physical addresses. In our analogy, the MMU is like a NAT firewall, translating 10.0.1.4 and 10.0.1.8 to external addresses, like 66.178.176.10.

  • IO Addresses: A firewall can’t get you onto the Internet without a router. An MMU can’t get you to memory or devices without a chipset. The “northbridge” in the chipset is like that router. A router figures out that when you use an external address like 66.178.176.10, you mean ESCAPE.COM, but when you say 69.61.87.163, you mean SKODA.SOCKPUPPET.ORG, which is on your own network. The chipset northbridge figures out that some physical addresses go to memory, and some go to devices.

  • The PCI Bus: is like the Internet. Nobody on the Internet can talk to you using your 10.0.1.8 address. 10-net addresses don’t route. PCI devices don’t use the virtual addresses that iTunes sees. They use physical addresses. The backbone routers PCI devices use to get to physical memory are, in X86 parlance, the “southbridge”.

  • Sound cards: are machines on the Internet. They want to talk to iTunes, which is on a private network. This works for PCI the same way it does for Skype on the real Internet. iTunes, via the kernel, somehow tells the sound card the external address of a block of memory it will read and write to. The sound card talks to the chipset to get access to that memory. That’s called DMA. It’s how pretty much all your peripherals work.
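
To make the analogy concrete, here is a toy model of the address path described in the bullets above: a page table doing the "NAT" step, and a chipset deciding whether a physical address lands in memory or on a device. It is a sketch only. The page-table contents, the 1 GiB RAM size, and the device window are all invented for illustration, and real hardware uses multi-level page tables rather than a flat dictionary.

```python
PAGE_SIZE = 4096  # 4 KiB pages, the common x86 page size

# Hypothetical page table for the iTunes process:
# virtual page number -> physical frame number.
# This plays the role of the NAT table that the kernel maintains.
ITUNES_PAGE_TABLE = {0x10: 0x2A0, 0x11: 0x2A1}


def mmu_translate(page_table, vaddr):
    """Translate a virtual address to a physical one, like NAT rewriting a private IP."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in page_table:
        raise MemoryError(f"page fault at {vaddr:#x}")
    return page_table[vpn] * PAGE_SIZE + offset


# Hypothetical physical address map, decoded by the chipset (the "router").
RAM_TOP = 0x4000_0000  # assume 1 GiB of RAM
DEVICE_WINDOWS = {"sound card": range(0xFEB0_0000, 0xFEB1_0000)}  # made-up window


def chipset_route(paddr):
    """Decide whether a physical address goes to memory or to a device."""
    if paddr < RAM_TOP:
        return "memory"
    for name, window in DEVICE_WINDOWS.items():
        if paddr in window:
            return name
    return "unclaimed"


paddr = mmu_translate(ITUNES_PAGE_TABLE, 0x10123)
print(hex(paddr), chipset_route(paddr))
```

The point to notice is that only `mmu_translate` knows about virtual addresses. Everything past it, including the chipset and any device, deals purely in physical addresses.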

In pictures. The memory and DMA story:

dma-small.png

Compared to the enterprise networking story:

enp-small.png

Clear as mud? You’re welcome. Why is this important for security? Mostly because of virtualization.

In 5 years, nobody will deploy new servers for applications. Every application will run on a virtual machine. Inevitably. And as we approach this end-state, more and more of our applications are virtualized, and performance becomes a bigger concern.

One of the ways virtualization hurts performance is by coming between the operating system and the hardware.

So we’d like some way of giving virtual machines direct access to slices of hardware. For example: a high-performance web server might want to own a NIC. The problem is, the web server lives alongside 10 other guest VMs that don’t trust it. The NIC’s DMA access forces them to.

X86 hypervisors do an elaborate dance with guest OS kernels, tricking them into believing they control the physical/virtual “NAT” table. That dance happens entirely within the CPU. On the Internet, Petko D. Petkov does not care about your 10.0.1.8 address. If your firewall will let him talk to you at 69.61.87.160, he can attack you. On your computer, a NIC does not care about your kernel and user virtual addresses. It’s using physical addresses, which address all the memory shared by all the VMs in the machine.
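
A toy illustration of that last point, assuming a machine with eight physical frames split between two guests (the frame counts, owner names, and payload bytes are all made up). CPU writes go through the guest’s page table; the device write does not.

```python
PAGE_SIZE = 4096

# Physical memory shared by every VM on the box, modeled as frames of bytes.
physical_memory = {frame: bytearray(PAGE_SIZE) for frame in range(8)}

# Hypothetical ownership: frames 0-3 belong to the web server VM,
# frames 4-7 to a neighbor VM that does not trust it.
FRAME_OWNER = {f: ("webserver" if f < 4 else "neighbor") for f in range(8)}

# The web server's page table only maps its own frames, so CPU accesses
# from that guest can never name frame 5.
WEBSERVER_PAGE_TABLE = {0: 0, 1: 1, 2: 2, 3: 3}


def cpu_write(page_table, vaddr, data):
    """A CPU write: the address is virtual and must pass through the page table."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    frame = page_table[vpn]  # a KeyError here stands in for a page fault
    physical_memory[frame][offset:offset + len(data)] = data


def dma_write(paddr, data):
    """A device write: physical address in, no page table consulted."""
    frame, offset = divmod(paddr, PAGE_SIZE)
    physical_memory[frame][offset:offset + len(data)] = data


# The guest's driver hands the NIC a physical address of its choosing.
target = 5 * PAGE_SIZE + 0x40
dma_write(target, b"\x90\x90\xcc")
print(FRAME_OWNER[target // PAGE_SIZE], bytes(physical_memory[5][0x40:0x43]))
```

Calling `cpu_write(WEBSERVER_PAGE_TABLE, 5 * PAGE_SIZE, ...)` would fault, because the guest has no mapping for that page. The `dma_write` call succeeds because nothing in its path ever looks at a page table.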

This is a hole in the X86 VMM security strategy. Today, if you want your guest VMs not to have to trust each other, and one of them needs direct access to a NIC, you have to trust that the NIC can’t be coerced into copying network packets with executable code directly into the kernel memory of the other machines.

Nate Lawson is writing a series on the nuts and bolts of this. CPU vendors are proposing features, such as IOMMUs, that will regulate device access to memory in much the same manner as the CPU’s MMU regulates access between processes. These features may or may not work in the long run; in the short run, it looks like they create show-stopping performance problems of their own.
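
As a rough sketch of the idea, an IOMMU can be thought of as a per-device table that every DMA request is checked against. The class below is hypothetical and models only the permission check; real IOMMUs also translate device addresses through their own page tables, and the device name and frame numbers here are invented.

```python
PAGE_SIZE = 4096


class ToyIOMMU:
    """Per-device set of physical frames the device may touch (illustrative only)."""

    def __init__(self):
        self.allowed = {}  # device name -> set of permitted frame numbers

    def map(self, device, frame):
        """Grant a device access to one physical frame."""
        self.allowed.setdefault(device, set()).add(frame)

    def unmap(self, device, frame):
        """Revoke a device's access to one physical frame."""
        self.allowed.get(device, set()).discard(frame)

    def check(self, device, paddr, length):
        """Raise if any byte of the transfer falls outside the device's frames."""
        first = paddr // PAGE_SIZE
        last = (paddr + length - 1) // PAGE_SIZE
        permitted = self.allowed.get(device, set())
        for frame in range(first, last + 1):
            if frame not in permitted:
                raise PermissionError(f"{device}: DMA to frame {frame} blocked")


iommu = ToyIOMMU()
iommu.map("nic0", 2)                      # hypervisor grants the NIC one buffer frame
iommu.check("nic0", 2 * PAGE_SIZE, 1500)  # inside the grant: allowed
try:
    iommu.check("nic0", 5 * PAGE_SIZE, 1500)  # another VM's frame
except PermissionError as err:
    print(err)
```

In this model the performance worry is about how often `map` and `unmap` get called: granting one long-lived region is cheap, while granting and revoking a frame around every individual transfer means a table update on each one.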

You only need to read Nate’s posts if you (a) believe virtualization will be ubiquitous within the next 5 years, (b) work in security, and (c) believe you need to sound like you know what you’re talking about. Otherwise, Nate’s posts are entirely optional.

12 Comments so far

  • grey

    September 28th, 2007 3:09 pm

    amd64’s already have iommu’s last I checked, this was one of the proposed potential remediations against things like Dornseif’s firewire/DMA pwn4g3. Though iirc, at the time amd64’s weren’t out, and there may still be other vectors without iommu’s (e.g. agp). That said, I’m not clear that iommu’s are being used for anything other than a performance boost in the VM world rather than a security measure.

    Regardless, I’d still argue that owning a VM is generally just as good as owning the underlying hardware - unless you’re trying to display some leet realmode demo splashes with your ownage (I’ve noticed a lot of 64k demos don’t run under Vmware for me sadly).

  • Thomas Ptacek

    September 28th, 2007 3:25 pm

    IOMMUs aren’t uncommon features, but if you read the papers that Nate links to, using them for fine-grained memory management (the simplest, most comprehensive way you’d think to use them to, say, manage DMA buffers for devices) turns out to be a performance problem.

    The issue with them is that if your memory manager has to demand-configure the IOMMU, setting up and tearing down entries regularly, the CPU load of managing those tables turns out to dominate the box.

  • Thomas Ptacek

    September 28th, 2007 3:26 pm

    Oh: regarding the impact of owning VMs versus real machines: think 5 years forward, when marketing shares VMMs with engineering; or, think today, when your ecommerce app shares a hypervisor with someone else’s.

  • securology

    September 28th, 2007 4:19 pm

  • Joseph Huang

    September 29th, 2007 7:31 am

    Exokernel Exokernel Exokernel Exokernel

  • Thomas Ptacek

    September 29th, 2007 12:00 pm

    I’d flag that last comment as spam, but, I agree.

  • Chris_B

    September 30th, 2007 11:03 pm

    a bit off topic, but I haven’t heard the name escape.com in many years now; makes me wonder how Roman and the old 2600 folks are getting by

  • Nate

    October 3rd, 2007 7:44 pm

    Other old hostnames:

    crimelab.com
    bagpuss.demon.co.uk
    trad.tacobell.com

    The last one happens to loosely be a googlewhack.

  • ivan

    October 4th, 2007 1:14 am

    and lets not forget…kremvax
    Meanwhile, i think people reading this blog post should go read Intel’s VT-d specs. An introductory article is here
    http://www.intel.com/technology/magazine/45nm/vtd-0507.htm?iid=techmag_0507+rhc_vtd

  • Mohit

    October 11th, 2007 4:31 pm

    Thomas,
    You are right given the assumption you made:
    - VMs don’t trust each other
    - A VM wants direct access to hardware

    I look at this as a limitation of virtualization technology and wonder if there is anything to gain from “addressing” it…no pun intended :)

    Mohit.

  • Jim's personal blog

    October 16th, 2007 10:09 pm

    x86 Memory system and virtualisation

    The X86 Memory System And Why It’s Hard To Virtualize Securely

    The first in a series of articles by Thomas Ptacek about how the x86 system memory architecture makes secure virtualisation difficult.

    This reinforces the message I got some time ag…
