linux-kernel - Re: [RFC PATCH v1 0/9] Hypervisor-Enforced Kernel Integrity

Message-ID: <cf1d6831-dac9-f738-44b4-a9dbc575b7e9@quicinc.com>
Date:   Wed, 24 May 2023 17:37:14 -0700
From:   Trilok Soni <quic_tsoni@...cinc.com>
To:     "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
        "mic@...ikod.net" <mic@...ikod.net>,
        "Christopherson,, Sean" <seanjc@...gle.com>,
        "bp@...en8.de" <bp@...en8.de>,
        "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
        "keescook@...omium.org" <keescook@...omium.org>,
        "hpa@...or.com" <hpa@...or.com>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "pbonzini@...hat.com" <pbonzini@...hat.com>,
        "wanpengli@...cent.com" <wanpengli@...cent.com>,
        "vkuznets@...hat.com" <vkuznets@...hat.com>
CC:     "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "yuanyu@...gle.com" <yuanyu@...gle.com>,
        "jamorris@...ux.microsoft.com" <jamorris@...ux.microsoft.com>,
        "marian.c.rotariu@...il.com" <marian.c.rotariu@...il.com>,
        "Graf, Alexander" <graf@...zon.com>,
        "Andersen, John S" <john.s.andersen@...el.com>,
        "madvenka@...ux.microsoft.com" <madvenka@...ux.microsoft.com>,
        "liran.alon@...cle.com" <liran.alon@...cle.com>,
        "ssicleru@...defender.com" <ssicleru@...defender.com>,
        "tgopinath@...rosoft.com" <tgopinath@...rosoft.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "qemu-devel@...gnu.org" <qemu-devel@...gnu.org>,
        "linux-security-module@...r.kernel.org" 
        <linux-security-module@...r.kernel.org>,
        "will@...nel.org" <will@...nel.org>,
        "xen-devel@...ts.xenproject.org" <xen-devel@...ts.xenproject.org>,
        "dev@...ts.cloudhypervisor.org" <dev@...ts.cloudhypervisor.org>,
        "mdontu@...defender.com" <mdontu@...defender.com>,
        "linux-hardening@...r.kernel.org" <linux-hardening@...r.kernel.org>,
        "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
        "virtualization@...ts.linux-foundation.org" 
        <virtualization@...ts.linux-foundation.org>,
        "nicu.citu@...oud.com" <nicu.citu@...oud.com>,
        "ztarkhani@...rosoft.com" <ztarkhani@...rosoft.com>,
        "x86@...nel.org" <x86@...nel.org>
Subject: Re: [RFC PATCH v1 0/9] Hypervisor-Enforced Kernel Integrity

On 5/24/2023 3:20 PM, Edgecombe, Rick P wrote:
> On Fri, 2023-05-05 at 17:20 +0200, Mickaël Salaün wrote:
>> # How does it work?
>>
>> This implementation mainly leverages KVM capabilities to control the
>> Second Layer Address Translation (or the Two Dimensional Paging e.g.,
>> Intel's EPT or AMD's RVI/NPT) and Mode Based Execution Control (Intel's
>> MBEC) introduced with the Kaby Lake (7th generation) architecture. This
>> allows to set permissions on memory pages in a complementary way to the
>> guest kernel's managed memory permissions. Once these permissions are
>> set, they are locked and there is no way back.
>>
>> A first KVM_HC_LOCK_MEM_PAGE_RANGES hypercall enables the guest kernel
>> to lock a set of its memory page ranges with either the
>> HEKI_ATTR_MEM_NOWRITE or the HEKI_ATTR_MEM_EXEC attribute. The first one
>> denies write access to a specific set of pages (allow-list approach),
>> and the second only allows kernel execution for a set of pages
>> (deny-list approach).
>>
>> The current implementation sets the whole kernel's .rodata (i.e., any
>> const or __ro_after_init variables, which includes critical security
>> data such as LSM parameters) and .text sections as non-writable, and
>> the .text section is the only one where kernel execution is allowed.
>> This is possible thanks to the new MBEC support also brought by this
>> series (otherwise the vDSO would have to be executable). Thanks to this
>> hardware support (VT-x, EPT and MBEC), the performance impact of such
>> guest protection is negligible.
>>
>> The second KVM_HC_LOCK_CR_UPDATE hypercall enables guests to pin some of
>> its CPU control register flags (e.g., X86_CR0_WP, X86_CR4_SMEP,
>> X86_CR4_SMAP), which is another complementary hardening mechanism.
>>
>> Heki can be enabled with the heki=1 boot command argument.
>>
>>
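The CR pinning described above can be pictured as a one-way mask check
on every control register write. This is only a minimal sketch, not the
code from the series: struct cr_pin and the helper names are
hypothetical, and it assumes a pinned flag must stay set. Only the bit
positions are architectural.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 bit positions. */
#define X86_CR0_WP   (UINT64_C(1) << 16)
#define X86_CR4_SMEP (UINT64_C(1) << 20)
#define X86_CR4_SMAP (UINT64_C(1) << 21)

/* Hypothetical per-register pin state kept on the hypervisor side. */
struct cr_pin {
	uint64_t pinned;	/* flags the guest asked to lock */
};

/* Pinning is one-way: flags can be added to the mask, never removed. */
static void cr_pin_add(struct cr_pin *p, uint64_t bits)
{
	p->pinned |= bits;
}

/* A register write is allowed only if every pinned flag stays set. */
static bool cr_write_allowed(const struct cr_pin *p, uint64_t new_val)
{
	return (p->pinned & ~new_val) == 0;
}

/* Small self-check: dropping SMEP from CR4 is refused once it is pinned. */
static void cr_pin_selfcheck(void)
{
	struct cr_pin cr4 = { 0 };

	cr_pin_add(&cr4, X86_CR4_SMEP | X86_CR4_SMAP);
	assert(cr_write_allowed(&cr4, X86_CR4_SMEP | X86_CR4_SMAP));
	assert(!cr_write_allowed(&cr4, X86_CR4_SMAP));
}
```

Because the mask only grows, a later request from a compromised guest
cannot undo an earlier pin in this model.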
> 
> Can the guest kernel ask the host VMM's emulated devices to DMA into
> the protected data? It should go through the host userspace mappings I
> think, which don't care about EPT permissions. Or did I miss where you
> are protecting that another way? There are a lot of easy ways to ask
> the host to write to guest memory that don't involve the EPT. You
> probably need to protect the host userspace mappings, and also the
> places in KVM that kmap a GPA provided by the guest.
> 
> [ snip ]
> 
>>
>> # Current limitations
>>
>> The main limitation of this patch series is the statically enforced
>> permissions. This is not an issue for kernels without modules but this
>> needs to be addressed.  Mechanisms that dynamically impact kernel
>> executable memory are not handled for now (e.g., kernel modules,
>> tracepoints, eBPF JIT), and such code will need to be authenticated.
>> Because the hypervisor is highly privileged and critical to the
>> security of all the VMs, we don't want to implement a code
>> authentication mechanism in the hypervisor itself but delegate this
>> verification to something much less privileged. We are thinking of two
>> ways to solve this: implement this verification in the VMM or spawn a
>> dedicated special VM (similar to Windows's VBS). There are pros and cons
>> to each approach: complexity, verification code ownership (guest's or
>> VMM's), access to guest memory (i.e., confidential computing).
> 
> The kernel often creates writable aliases in order to write to
> protected data (kernel text, etc). Some of this is done right as text
> is being first written out (alternatives for example), and some happens
> way later (jump labels, etc). So for verification, I wonder what stage
> you would be verifying? If you want to verify the end state, you would
> have to maintain knowledge in the verifier of all the touch-ups the
> kernel does. I think it would get very tricky.

Right. And on ARM (from what I know), errata workarounds can be applied
through the alternatives framework when a CPU is hotplugged in after boot.
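
To make the verification problem concrete, here is a toy userspace
sketch, not kernel code. struct touchup, apply_touchup() and
verify_text() are hypothetical names, and a plain memcmp() stands in
for whatever authentication a real verifier would use. It shows that an
end-state check only passes if the verifier replays every touch-up the
kernel has applied, including ones that arrive after boot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TEXT_MAX 64

/* Hypothetical record of one in-place patch to kernel text. */
struct touchup {
	size_t offset;
	size_t len;
	uint8_t bytes[4];
};

/* Apply one touch-up; returns 0 on success, -1 if it is out of range. */
static int apply_touchup(uint8_t *text, size_t text_len,
			 const struct touchup *t)
{
	if (t->len > sizeof(t->bytes) || t->offset > text_len ||
	    t->len > text_len - t->offset)
		return -1;
	memcpy(text + t->offset, t->bytes, t->len);
	return 0;
}

/*
 * End-state check: replay the touch-ups the verifier knows about onto a
 * copy of the reference image, then compare with the live text.
 * Returns 1 on match, 0 otherwise.
 */
static int verify_text(const uint8_t *live, const uint8_t *ref, size_t len,
		       const struct touchup *known, size_t nknown)
{
	uint8_t expected[TEXT_MAX];

	if (len > sizeof(expected))
		return 0;
	memcpy(expected, ref, len);
	for (size_t i = 0; i < nknown; i++)
		if (apply_touchup(expected, len, &known[i]))
			return 0;
	return memcmp(live, expected, len) == 0;
}

/* Small self-check: an unknown late patch makes the check fail. */
static void verify_selfcheck(void)
{
	const uint8_t ref[4] = { 1, 2, 3, 4 };
	uint8_t live[4] = { 1, 2, 3, 4 };
	const struct touchup late = { .offset = 1, .len = 2, .bytes = { 9, 9 } };

	assert(apply_touchup(live, sizeof(live), &late) == 0);
	assert(!verify_text(live, ref, sizeof(ref), NULL, 0));
	assert(verify_text(live, ref, sizeof(ref), &late, 1));
}
```

In this model a patch applied at CPU hotplug time is indistinguishable
from tampering unless the verifier has it in its known list.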

---Trilok Soni