Message-ID: <e0fa3af9-503c-4569-86e0-571a0218c35b@citrix.com>
Date: Wed, 9 Oct 2024 14:52:29 +0100
From: Andrew Cooper <andrew.cooper3@...rix.com>
To: "Manwaring, Derek" <derekmn@...zon.com>, seanjc@...gle.com,
 dave.hansen@...ux.intel.com
Cc: ackerleytng@...gle.com, ajones@...tanamicro.com, anup@...infault.org,
 bfoster@...hat.com, brauner@...nel.org, david@...hat.com,
 erdemaktas@...gle.com, fan.du@...el.com, fvdl@...gle.com,
 haibo1.xu@...el.com, isaku.yamahata@...el.com, jgg@...dia.com,
 jgowans@...zon.com, jhubbard@...dia.com, jthoughton@...gle.com,
 jun.miao@...el.com, kalyazin@...zon.co.uk, kent.overstreet@...ux.dev,
 kvm@...r.kernel.org, linux-fsdevel@...ck.org, linux-kernel@...r.kernel.org,
 linux-kselftest@...r.kernel.org, linux-mm@...ck.org,
 maciej.wieczor-retman@...el.com, mike.kravetz@...cle.com,
 muchun.song@...ux.dev, oliver.upton@...ux.dev, pbonzini@...hat.com,
 peterx@...hat.com, pgonda@...gle.com, pvorel@...e.cz, qperret@...gle.com,
 quic_eberman@...cinc.com, richard.weiyang@...il.com, rientjes@...gle.com,
 roypat@...zon.co.uk, rppt@...nel.org, shuah@...nel.org, tabba@...gle.com,
 vannapurve@...gle.com, vkuznets@...hat.com, willy@...radead.org,
 zhiquan1.li@...el.com, graf@...zon.de, mlipp@...zon.at, canellac@...zon.at
Subject: Re: [RFC PATCH 30/39] KVM: guest_memfd: Handle folio preparation for
 guest_memfd mmap

On 09/10/2024 4:51 am, Manwaring, Derek wrote:
> On 2024-10-08 at 19:56+0000 Sean Christopherson wrote:
>> Another (slightly crazy) approach would be to use protection keys to provide the
>> security properties that you want, while giving KVM (and userspace) a quick-and-easy
>> override to access guest memory.
>>
>>   1. mmap() guest_memfd into userspace with RW protections
>>   2. Configure PKRU to make guest_memfd memory inaccessible by default
>>   3. Swizzle PKRU on-demand when intentionally accessing guest memory
>>
>> It's essentially the same idea as SMAP+STAC/CLAC, just applied to guest memory
>> instead of to userspace memory.
>>
>> The benefit of the PKRU approach is that there are no PTE modifications, and thus
>> no TLB flushes, and only the CPU that is accessing guest memory gains temporary
>> access.  The big downside is that it would be limited to modern hardware, but
>> that might be acceptable, especially if it simplifies KVM's implementation.
> Yeah this might be worth it if it simplifies significantly. Jenkins et
> al. showed MPK worked for stopping in-process Spectre V1 [1]. While
> future hardware bugs are always possible, the host kernel would still
> offer better protection overall since discovery of additional Spectre
> approaches and gadgets in the kernel is more likely (I think it's a
> bigger surface area than hardware-specific MPK transient execution
> issues).
>
> Patrick, we talked about this a couple weeks ago and ended up focusing
> on within-userspace protection, but I see keys can also be used to stop
> kernel access like Andrew's project he mentioned during Dave's MPK
> session at LPC [2]. Andrew, could you share that here?

This was in reference to PKS specifically (so Sapphire Rapids and
later), and to Xen, but the technique is general.

Allocate one supervisor key for the directmap (and other ranges wanting
protecting), and configure MSR_PKS[key]=AD by default.

Protection Keys were identified as being safe as a defence against
Meltdown.  At the time, only PKRU existed, and PKS was expected to have
less overhead than KPTI on Skylake, which made its absence even more
frustrating for those of us who'd begged for a supervisor form at the
time.  What's done is done.


The changes needed in main code would be accessors for directmap
pointers, because there needs to be a temporary AD-disable.  This would
take the form of 2x WRMSR, as opposed to a STAC/CLAC pair.
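
As a rough illustration only (this is not Xen's or Linux's actual code),
such an accessor might look like the sketch below.  The names
wrmsr_pks(), cached_pks(), directmap_read() and DIRECTMAP_PKEY are
hypothetical, and the MSR is modelled by a plain variable so the example
can run in userspace.  The only architectural detail assumed is the
per-key layout shared with PKRU: bit 2*key is AD, bit 2*key+1 is WD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Per-key layout (same as PKRU): bit 2*key = Access-Disable,
 * bit 2*key+1 = Write-Disable. */
#define PKEY_AD(key) (UINT32_C(1) << (2 * (key)))
#define PKEY_WD(key) (UINT32_C(2) << (2 * (key)))

/* Hypothetical: the supervisor key chosen to tag the directmap. */
#define DIRECTMAP_PKEY 1

/* Userspace stand-in for the PKS MSR.  Default state: AD set for the
 * directmap key, so stray accesses to the directmap would fault. */
static uint32_t fake_msr_pks = PKEY_AD(DIRECTMAP_PKEY);
static unsigned int wrmsr_count;

/* Hypothetical helper: real code would execute WRMSR here. */
static void wrmsr_pks(uint32_t val)
{
    fake_msr_pks = val;
    wrmsr_count++;
}

/* Hypothetical helper: real code would likely keep a per-CPU cached
 * copy rather than reading the MSR back. */
static uint32_t cached_pks(void)
{
    return fake_msr_pks;
}

/* Accessor: copy len bytes out of a directmap pointer, lifting AD
 * only for the duration of the copy. */
static void directmap_read(void *dst, const void *directmap_src, size_t len)
{
    uint32_t locked = cached_pks();

    assert(dst != NULL && directmap_src != NULL);

    wrmsr_pks(locked & ~PKEY_AD(DIRECTMAP_PKEY)); /* 1st WRMSR: open window */
    memcpy(dst, directmap_src, len);
    wrmsr_pks(locked);                            /* 2nd WRMSR: restore */
}
```

Restoring the saved value, rather than unconditionally setting AD again,
is one way to keep nested uses of the accessor from closing the window
early.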

An area of concern is the overhead of the WRMSRs.  MSR_PKS is defined as
not-architecturally-serialising, but like STAC/CLAC probably comes with
model-dependent dispatch-serialising properties to prevent memory
accesses executing speculatively under the wrong protection key.

Also, for this strategy to be effective, you need to PKEY-tag all
aliases of the memory.

~Andrew
