[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Zj5ETYPTUo9T4Nuf@google.com>
Date: Fri, 10 May 2024 08:59:09 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Michael Roth <michael.roth@....com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org, linux-coco@...ts.linux.dev,
linux-mm@...ck.org, linux-crypto@...r.kernel.org, x86@...nel.org,
linux-kernel@...r.kernel.org, tglx@...utronix.de, mingo@...hat.com,
jroedel@...e.de, thomas.lendacky@....com, hpa@...or.com, ardb@...nel.org,
vkuznets@...hat.com, jmattson@...gle.com, luto@...nel.org,
dave.hansen@...ux.intel.com, slp@...hat.com, pgonda@...gle.com,
peterz@...radead.org, srinivas.pandruvada@...ux.intel.com,
rientjes@...gle.com, dovmurik@...ux.ibm.com, tobin@....com, bp@...en8.de,
vbabka@...e.cz, kirill@...temov.name, ak@...ux.intel.com, tony.luck@...el.com,
sathyanarayanan.kuppuswamy@...ux.intel.com, alpergun@...gle.com,
jarkko@...nel.org, ashish.kalra@....com, nikunj.dadhania@....com,
pankaj.gupta@....com, liam.merwick@...cle.com, papaluri@....com,
Isaku Yamahata <isaku.yamahata@...el.com>
Subject: Re: [PATCH v15 21/23] KVM: MMU: Disable fast path for private memslots
On Fri, May 10, 2024, Michael Roth wrote:
> On Fri, May 10, 2024 at 03:50:26PM +0200, Paolo Bonzini wrote:
> > On Fri, May 10, 2024 at 3:47 PM Sean Christopherson <seanjc@...gle.com> wrote:
> > >
> > > > + * Since software-protected VMs don't have a notion of a shared vs.
> > > > + * private that's separate from what KVM is tracking, the above
> > > > + * KVM_EXIT_MEMORY_FAULT condition wouldn't occur, so avoid the
> > > > + * special handling for that case for now.
> > >
> > > Very technically, it can occur if userspace _just_ modified the attributes. And
> > > as I've said multiple times, at least for now, I want to avoid special casing
> > > SW-protected VMs unless it is *absolutely* necessary, because their sole purpose
> > > is to allow testing flows that are impossible to excercise without SNP/TDX hardware.
> >
> > Yep, it is not like they have to be optimized.
>
> Ok, I thought there were maybe some future plans to use sw-protected VMs
> to get some added protections from userspace. But even then there'd
> probably still be extra considerations for how to handle access tracking
> so white-listing them probably isn't right anyway.
>
> I was also partly tempted to take this route because it would cover this
> TDX patch as well:
>
> https://lore.kernel.org/lkml/91c797997b57056224571e22362321a23947172f.1705965635.git.isaku.yamahata@intel.com/
Hmm, I'm pretty sure that patch is trying to fix the exact same issue you are
fixing, just in a less precise way. S-EPT entries only support RWX=0 and RWX=111b,
i.e. it should be impossible to have a write-fault to a present S-EPT entry.
And if TDX is running afoul of this code:
if (!fault->present)
return !kvm_ad_enabled();
then KVM should do the sane thing and require A/D support be enabled for TDX.
And if it's something else entirely, that changelog has some explaining to do.
> and avoid any weirdness about checking kvm_mem_is_private() without
> checking mmu_invalidate_seq, but I think those cases all end up
> resolving themselves eventually and added some comments around that.
Yep, checking state that is protected by mmu_invalidate_seq outside of mmu_lock
is definitely allowed, e.g. the entire fast page fault path operates outside of
mmu_lock and thus outside of mmu_invalidate_seq's purview.
It's a-ok because the SPTE are done with an atomic CMPXCHG, and so KVM only needs
to ensure its page tables aren't outright _freed_. If the zap triggered by the
attributes change "wins", then the fast #PF path will fail the CMPXCHG and be an
expensive NOP. If the fast #PF wins, the zap will pave over the fast #PF fix,
and the IPI+flush that is needed for all zaps, to ensure vCPUs don't have stale
references, does the rest.
And if there's an attributes race that causes the fast #PF to bail early, the vCPU
will see the correct state on the next page fault.
Powered by blists - more mailing lists