The X86 Memory System And Why It’s Hard To Virtualize Securely
Thomas Ptacek | September 28th, 2007 | Filed Under: Defenses, Uncategorized
The X86 memory system causes surprising confusion. By way of analogy: compare iTunes running on Windows to machines talking to the Internet from a corporate network. Here’s how memory breaks down in the analogy:
User virtual addresses: Pretend iTunes is a VMware guest image, running in bridged networking mode, assigned 10.0.1.8 (a private IP).
Kernel virtual addresses: Windows iTunes runs under the Windows NTOSKRNL kernel; in our analogy, this is VMware’s host operating system. Like the bridged VMware guest, it’s on the internal net, with 10.0.1.4.
Physical addresses: NTOSKRNL works with the CPU’s MMU to map virtual addresses to physical addresses. In our analogy, the MMU is like a NAT firewall, translating 10.0.1.4 and 10.0.1.8 to external addresses, like 66.178.176.10.
IO Addresses: A firewall can’t get you onto the Internet without a router. An MMU can’t get you to memory or devices without a chipset. The “northbridge” in the chipset is like that router. A router figures out that when you use an external address like 66.178.176.10, you mean ESCAPE.COM, but when you say 69.61.87.163, you mean SKODA.SOCKPUPPET.ORG, which is on your own network. The chipset northbridge figures out that some physical addresses go to memory, and some go to devices.
The PCI Bus: is like the Internet. Nobody on the Internet can talk to you using your 10.0.1.8 address. 10-net addresses don’t route. PCI devices don’t use virtual addresses, like iTunes sees. They use physical addresses. The backbone routers PCI devices use to get to physical memory are, in X86 parlance, the “southbridge”.
Sound cards: are machines on the Internet. They want to talk to iTunes, which is on a private network. This works for PCI the same way it does for Skype on the real Internet. iTunes, via the kernel, somehow tells the sound card the external address of a block of memory it will read and write to. The sound card talks to the chipset to get access to that memory. That’s called DMA. It’s how pretty much all your peripherals work.
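To make the analogy concrete, here is a minimal Python sketch of the first two hops: the MMU translating a virtual address to a physical one, and the chipset deciding whether that physical address lands in memory or on a device. The page table contents, the address ranges, and the function names are all made up for illustration; real X86 page tables are multi-level, and the real physical address map is set up by the platform, not hardcoded.

```python
PAGE_SIZE = 4096

# Hypothetical page table for one process: virtual page number -> physical frame number.
ITUNES_PAGE_TABLE = {0x10: 0x2A0, 0x11: 0x2A1}

# Hypothetical physical address map the chipset decodes: (start, end, target).
PHYSICAL_MAP = [
    (0x0000_0000, 0x7FFF_FFFF, "RAM"),
    (0xF000_0000, 0xF000_FFFF, "sound card registers"),
]

def mmu_translate(page_table, vaddr):
    """Translate a virtual address to a physical one (the "NAT firewall" step)."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in page_table:
        raise MemoryError(f"page fault at {vaddr:#x}")
    return page_table[vpn] * PAGE_SIZE + offset

def chipset_route(paddr):
    """Decide whether a physical address goes to memory or a device (the "router" step)."""
    for start, end, target in PHYSICAL_MAP:
        if start <= paddr <= end:
            return target
    return "unmapped"

vaddr = 0x10 * PAGE_SIZE + 0x123
paddr = mmu_translate(ITUNES_PAGE_TABLE, vaddr)
print(hex(vaddr), "->", hex(paddr), "->", chipset_route(paddr))
```

Only the CPU goes through mmu_translate. A device doing DMA starts from a physical address, so in this sketch it would enter at chipset_route and never consult the page table at all.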
In pictures. The memory and DMA story:
Compared to the enterprise networking story:
Clear as mud? You’re welcome. Why is this important for security? Mostly because of virtualization.
In 5 years, nobody will deploy new servers for applications. Every application will run on a virtual machine. Inevitably. And as we approach this end-state, more and more of our applications are virtualized, and performance becomes a bigger concern.
One of the ways virtualization hurts performance is by coming between the operating system and the hardware.
So we’d like some way of giving virtual machines direct access to slices of hardware. For example: a high-performance web server might want to own a NIC. The problem is, the web server lives alongside 10 other guest VMs that don’t trust it. The NIC’s DMA access forces them to.
X86 hypervisors do an elaborate dance with guest OS kernels, tricking them into believing they control the physical/virtual “NAT” table. That dance happens entirely within the CPU. On the Internet, Petko D. Petkov does not care about your 10.0.1.8 address. If your firewall will let him talk to you at 69.61.87.160, he can attack you. On your computer, a NIC does not care about your kernel and user virtual addresses. It’s using physical addresses, which address all the memory shared by all the VMs in the machine.
This is a hole in the X86 VMM security strategy. Today, if you want your guest VMs not to have to trust each other, and one of them needs direct access to a NIC, you have to trust that the NIC can’t be coerced into copying network packets with executable code directly into the kernel memory of the other machines.
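A toy Python model of the problem, under heavy assumptions: physical memory is one flat bytearray, two guests each own half of it, and the NIC is an object that writes incoming packets to whatever physical address its owner programmed. The guest names, the frame split, and the Nic class are hypothetical; no real device or hypervisor interface is being described.

```python
PAGE_SIZE = 4096
physical_memory = bytearray(8 * PAGE_SIZE)

# Hypothetical partition of physical frames between two guests.
GUEST_FRAMES = {"web_server_vm": range(0, 4), "other_vm": range(4, 8)}

def owner_of(paddr):
    """Return the guest whose memory contains this physical address."""
    frame = paddr // PAGE_SIZE
    for guest, frames in GUEST_FRAMES.items():
        if frame in frames:
            return guest
    return None

class Nic:
    """A DMA-capable device: it writes wherever its receive descriptor points."""

    def __init__(self, memory):
        self.memory = memory
        self.rx_buffer_paddr = 0

    def program_rx_buffer(self, paddr):
        # Set by the guest that owns the NIC. Nothing checks the address.
        self.rx_buffer_paddr = paddr

    def receive(self, packet):
        start = self.rx_buffer_paddr
        self.memory[start:start + len(packet)] = packet

nic = Nic(physical_memory)
target = 5 * PAGE_SIZE             # a frame that belongs to other_vm
nic.program_rx_buffer(target)      # web_server_vm points the NIC at it anyway
nic.receive(b"\x90\x90\xcc")       # packet bytes land in other_vm's memory
print(owner_of(target), physical_memory[target:target + 3].hex())
```

Nothing in the CPU-side page tables is involved in receive, which is the point: the guest that owns the NIC picks the physical address, and the write goes through.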
Nate Lawson is writing a series on the nuts and bolts of this. CPU vendors are proposing features, such as IOMMUs, that will regulate device access to memory in much the same manner as the CPU’s MMU regulates access between processes. These features may or may not work in the long run; in the short run, it looks like they pose show-stopping performance problems of their own.
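Roughly what an IOMMU adds, sketched in Python. This is a simplification under stated assumptions: it only models a per-device allow-list of physical frames that every DMA write is checked against, whereas real IOMMUs also translate device-visible addresses. The Iommu class, the DmaFault exception, and the device name "nic0" are invented for the example.

```python
PAGE_SIZE = 4096

class DmaFault(Exception):
    """Raised when a device touches a frame it was never granted."""

class Iommu:
    def __init__(self):
        self.allowed = {}  # device name -> set of permitted physical frames

    def map(self, device, frame):
        self.allowed.setdefault(device, set()).add(frame)

    def unmap(self, device, frame):
        self.allowed.get(device, set()).discard(frame)

    def check(self, device, paddr, length):
        """Fault unless every frame covered by the access is granted."""
        if length == 0:
            return
        granted = self.allowed.get(device, set())
        first = paddr // PAGE_SIZE
        last = (paddr + length - 1) // PAGE_SIZE
        for frame in range(first, last + 1):
            if frame not in granted:
                raise DmaFault(f"{device} denied access to frame {frame}")

def dma_write(iommu, memory, device, paddr, data):
    """All device writes pass through the IOMMU check before touching memory."""
    iommu.check(device, paddr, len(data))
    memory[paddr:paddr + len(data)] = data

memory = bytearray(8 * PAGE_SIZE)
iommu = Iommu()
for frame in range(0, 4):          # grant the NIC only its own guest's frames
    iommu.map("nic0", frame)

dma_write(iommu, memory, "nic0", 1 * PAGE_SIZE, b"ok")   # inside the grant
try:
    dma_write(iommu, memory, "nic0", 5 * PAGE_SIZE, b"\x90\x90\xcc")
    blocked = False
except DmaFault as fault:
    blocked = True
    print(fault)
```

Each map and unmap call here stands in for a table update that the hypervisor has to perform. Doing that per buffer, rather than once at setup, is where the performance concern comes from.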
You only need to read Nate’s posts if you (a) believe virtualization will be ubiquitous within the next 5 years, (b) work in security, and (c) believe you need to sound like you know what you’re talking about. Otherwise, Nate’s posts are entirely optional.




grey
September 28th, 2007 3:09 pm
amd64’s already have iommu’s last I checked, this was one of the proposed potential remediations against things like Dornseif’s firewire/DMA pwn4g3. Though iirc, at the time amd64’s weren’t out, and there may still be other vectors without iommu’s (e.g. agp). That said, I’m not clear that iommu’s are being used for anything other than a performance boost in the VM world rather than a security measure.
Regardless, I’d still argue that owning a VM is generally just as good as owning the underlying hardware - unless you’re trying to display some leet realmode demo splashes with your ownage (I’ve noticed a lot of 64k demos don’t run under Vmware for me sadly).
Thomas Ptacek
September 28th, 2007 3:25 pm
IOMMUs aren’t uncommon features, but if you read the papers that Nate links to, using them for fine-grained memory management (the simplest, most comprehensive way you’d think to use them to, say, manage DMA buffers for devices) turns out to be a performance problem.
The issue with them is that if your memory manager has to demand-configure the IOMMU, setting up and tearing down entries regularly, the CPU load of managing those tables turns out to dominate the box.
Thomas Ptacek
September 28th, 2007 3:26 pm
Oh: regarding the impact of owning VMs versus real machines: think 5 years forward, when marketing shares VMMs with engineering; or, think today, when your ecommerce app shares a hypervisor with someone else’s.
securology
September 28th, 2007 4:19 pm
Citing Thomas Ptacek at: http://securology.blogspot.com/2007/09/thomas-ptacek-on-dma-virtualization-and.html
Joseph Huang
September 29th, 2007 7:31 am
Exokernel Exokernel Exokernel Exokernel
Thomas Ptacek
September 29th, 2007 12:00 pm
I’d flag that last comment as spam, but, I agree.
Chris_B
September 30th, 2007 11:03 pm
a bit off topic, but I haven’t heard the name escape.com in many years now; makes me wonder how Roman and the old 2600 folks are getting by
Nate
October 3rd, 2007 7:44 pm
Other old hostnames:
crimelab.com
bagpuss.demon.co.uk
trad.tacobell.com
The last one happens to loosely be a googlewhack.
ivan
October 4th, 2007 1:14 am
and let’s not forget…kremvax
Meanwhile, I think people reading this blog post should go read Intel’s VT-d specs. An introductory article is here
http://www.intel.com/technology/magazine/45nm/vtd-0507.htm?iid=techmag_0507+rhc_vtd
Mohit
October 11th, 2007 4:31 pm
Thomas,
You are right given the assumptions you made:
- VMs don’t trust each other
- A VM wants direct access to hardware
I look at this as a limitation of virtualization technology and wonder if there is anything to gain from “addressing” it…no pun intended
Mohit.
Jim's personal blog
October 16th, 2007 10:09 pm
x86 Memory system and virtualisation
The X86 Memory System And Why It’s Hard To Virtualize Securely
The first in a series of articles by Thomas Ptacek about how the x86 system memory architecture makes secure virtualisation difficult.
This reinforces the message I got some time ag…