[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e430efa3-6a7a-44c8-a1d2-9943c76f748e@linux.microsoft.com>
Date: Wed, 29 Nov 2023 13:47:07 -0600
From: "Madhavan T. Venkataraman" <madvenka@...ux.microsoft.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Mickaël Salaün <mic@...ikod.net>,
Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
"H . Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
Kees Cook <keescook@...omium.org>,
Paolo Bonzini <pbonzini@...hat.com>,
Sean Christopherson <seanjc@...gle.com>,
Thomas Gleixner <tglx@...utronix.de>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Alexander Graf <graf@...zon.com>,
Chao Peng <chao.p.peng@...ux.intel.com>,
"Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
Forrest Yuan Yu <yuanyu@...gle.com>,
James Gowans <jgowans@...zon.com>,
James Morris <jamorris@...ux.microsoft.com>,
John Andersen <john.s.andersen@...el.com>,
Marian Rotariu <marian.c.rotariu@...il.com>,
Mihai Donțu <mdontu@...defender.com>,
Nicușor Cîțu <nicu.citu@...oud.com>,
Thara Gopinath <tgopinath@...rosoft.com>,
Trilok Soni <quic_tsoni@...cinc.com>,
Wei Liu <wei.liu@...nel.org>, Will Deacon <will@...nel.org>,
Yu Zhang <yu.c.zhang@...ux.intel.com>,
Zahra Tarkhani <ztarkhani@...rosoft.com>,
Ștefan Șicleru <ssicleru@...defender.com>,
dev@...ts.cloudhypervisor.org, kvm@...r.kernel.org,
linux-hardening@...r.kernel.org, linux-hyperv@...r.kernel.org,
linux-kernel@...r.kernel.org,
linux-security-module@...r.kernel.org, qemu-devel@...gnu.org,
virtualization@...ts.linux-foundation.org, x86@...nel.org,
xen-devel@...ts.xenproject.org
Subject: Re: [RFC PATCH v2 18/19] heki: x86: Protect guest kernel memory using
the KVM hypervisor
On 11/27/23 14:03, Peter Zijlstra wrote:
> On Mon, Nov 27, 2023 at 11:05:23AM -0600, Madhavan T. Venkataraman wrote:
>> Apologies for the late reply. I was on vacation. Please see my response below:
>>
>> On 11/13/23 02:54, Peter Zijlstra wrote:
>>> On Sun, Nov 12, 2023 at 09:23:25PM -0500, Mickaël Salaün wrote:
>>>> From: Madhavan T. Venkataraman <madvenka@...ux.microsoft.com>
>>>>
>>>> Implement a hypervisor function, kvm_protect_memory() that calls the
>>>> KVM_HC_PROTECT_MEMORY hypercall to request the KVM hypervisor to
>>>> set specified permissions on a list of guest pages.
>>>>
>>>> Using the protect_memory() function, set proper EPT permissions for all
>>>> guest pages.
>>>>
>>>> Use the MEM_ATTR_IMMUTABLE property to protect the kernel static
>>>> sections and the boot-time read-only sections. This enables to make sure
>>>> a compromised guest will not be able to change its main physical memory
>>>> page permissions. However, this also disable any feature that may change
>>>> the kernel's text section (e.g., ftrace, Kprobes), but they can still be
>>>> used on kernel modules.
>>>>
>>>> Module loading/unloading, and eBPF JIT is allowed without restrictions
>>>> for now, but we'll need a way to authenticate these code changes to
>>>> really improve the guests' security. We plan to use module signatures,
>>>> but there is no solution yet to authenticate eBPF programs.
>>>>
>>>> Being able to use ftrace and Kprobes in a secure way is a challenge not
>>>> solved yet. We're looking for ideas to make this work.
>>>>
>>>> Likewise, the JUMP_LABEL feature cannot work because the kernel's text
>>>> section is read-only.
>>>
>>> What is the actual problem? As is the kernel text map is already RO and
>>> never changed.
>>
>> For the JUMP_LABEL optimization, the text needs to be patched at some point.
>> That patching requires a writable mapping of the text page at the time of
>> patching.
>>
>> In this Heki feature, we currently lock down the kernel text at the end of
>> kernel boot just before kicking off the init process. The lockdown is
>> implemented by setting the permissions of a text page to R_X in the extended
>> page table and not allowing write permissions in the EPT after that. So, jump label
>> patching during kernel boot is not a problem. But doing it after kernel
>> boot is a problem.
>
> But you see, that's exactly what the kernel already does with the normal
> permissions. They get set to RX after init and are never changed.
>
> See the previous patch, we establish a read-write alias and write there.
>
> You seem to lack basic understanding of how the kernel works in this
> regard, which makes me very nervous about you touching any of this.
>
> I must also say I really dislike your extra/random permssion calls all
> over the place. They don't really get us anything afaict. Why can't you
> plumb into the existing set_memory_*() family?
I have responded to your comments on your other email. Please read my
response there.
Thanks.
Madhavan
Powered by blists - more mailing lists