Google Android - Unprotected MSRs in EL1 RKP Privilege Escalation









As part of Samsung KNOX, Samsung phones include a security hypervisor called RKP (Real-time Kernel Protection), running in EL2. This hypervisor is meant to ensure that the HLOS kernel running in EL1 remains protected from exploits and aims to prevent privilege escalation attacks by "shielding" certain data structures within the hypervisor.

In order to prevent EL0 and EL1 code from creating disallowed memory mappings (e.g., disabling PXN on areas not in the kernel text or enabling write permissions to a kernel code page), RKP employs a system through which all modifications to the S1 translation table are validated by the hypervisor. Moreover, RKP marks the EL0 and EL1 translation tables as read-only in the stage 2 translation table for EL0/1.

Normally, an adversary running in EL1 would be able to directly modify the value of TTBR0_EL1 and TTBR1_EL1, which would allow them to subvert the S1 protections. However, RKP correctly traps such MSRs in order to make sure any new translation table is also verified.

Specifically, when an MSR to a memory management control register is executed by EL1, it triggers a synchronous exception in the hypervisor. In the case of RKP, this exception is handled by the function "vmm_synchronous_handler". The function checks whether the abort is due to an MRS/MSR from EL1, and if so, calls the function "other_msr_system" in order to service the request.

As mentioned above - RKP does, in fact, verify the translation tables when set via TTBR0 and TTBR1 (by calling "rkp_l1pgt_ttbr"). However, for MSRs targeting the TCR_EL1 and SCLTR_EL1 registers, it directly modifies their value from EL2 without performing any validation.

These two registers are extremely sensitive and modifying their values allows an attacker to subvert the RKP memory protections.

In the case of TCR_EL1, the attacker can set TCR_EL1.TG0 or TCR_EL1.TG1 in order to signal that the translation granule for TTBR0 or TTBR1 (accordingly) is any value other than the default 4KB granule used by the Linux Kernel.

Modifying the translation granule allows an attacker to subvert the stage-1 memory mapping restrictions used by RKP. This is since RKP incorrectly assumes that the translation granule is 4KB without actually checking the value in TCR_EL1.TGx.

For example, when protecting the translation table in TTBRx_EL1, RKP only s2-protects a 4KB region - since when using the 4KB granule, the translation regime has a translation table size of 4KB. However, for a translation granule of 64KB, the translation regime has a translation table size of 64KB.

This means that the bottom 60KB of the translation table remain unprotected by RKP, allowing an attacker in EL1 to freely modify it in order to point to any wanted IPA, with any AP and PXN/UXN values.

In the case of SCTLR_EL1, the attacker can unset SCTLR_EL1.M in order to disable the stage 1 MMU for EL0 and EL1 translations. This would allow an attacker to trivially bypass the stage-1 protections (such as the ones discussed above), as no AP or XN permission checks would be present for stage 1 translations.

Lastly, it should be noted that while these MSRs might not be present in the kernel's code, they *are* present in RKP's code. As RKP's code pages are executable from EL1, an attacker can simply call these MSRs directly from RKP's code while running in EL1.

I've statically verified this issue on the RKP binary (version "RKP4.2_CL7572479") present in the open-source kernel package "SM-G935F_MM_Opensource".

Proof of concept for the RKP unprotected MSRs issue.

This PoC disables the M bit in SCTLR_EL1.

Proof of Concept: