Message-ID: <SN6PR02MB415717E09C249A31F2A4E229D4BCA@SN6PR02MB4157.namprd02.prod.outlook.com>
Date: Tue, 28 Nov 2023 19:12:33 +0000
From: Michael Kelley <mhklinux@...look.com>
To: "kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>
CC: "tglx@...utronix.de" <tglx@...utronix.de>,
"mingo@...hat.com" <mingo@...hat.com>,
"bp@...en8.de" <bp@...en8.de>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"x86@...nel.org" <x86@...nel.org>, "hpa@...or.com" <hpa@...or.com>,
"kys@...rosoft.com" <kys@...rosoft.com>,
"haiyangz@...rosoft.com" <haiyangz@...rosoft.com>,
"wei.liu@...nel.org" <wei.liu@...nel.org>,
"decui@...rosoft.com" <decui@...rosoft.com>,
"luto@...nel.org" <luto@...nel.org>,
"peterz@...radead.org" <peterz@...radead.org>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"urezki@...il.com" <urezki@...il.com>,
"hch@...radead.org" <hch@...radead.org>,
"lstoakes@...il.com" <lstoakes@...il.com>,
"thomas.lendacky@....com" <thomas.lendacky@....com>,
"ardb@...nel.org" <ardb@...nel.org>,
"jroedel@...e.de" <jroedel@...e.de>,
"seanjc@...gle.com" <seanjc@...gle.com>,
"rick.p.edgecombe@...el.com" <rick.p.edgecombe@...el.com>,
"sathyanarayanan.kuppuswamy@...ux.intel.com"
<sathyanarayanan.kuppuswamy@...ux.intel.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-coco@...ts.linux.dev" <linux-coco@...ts.linux.dev>,
"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: RE: [PATCH v2 0/8] x86/coco: Mark CoCo VM pages not present when
changing encrypted state
From: kirill.shutemov@...ux.intel.com <kirill.shutemov@...ux.intel.com> Sent: Friday, November 24, 2023 2:06 AM
>
> On Tue, Nov 21, 2023 at 01:20:08PM -0800, mhkelley58@...il.com wrote:
> > From: Michael Kelley <mhklinux@...look.com>
> >
> > In a CoCo VM when a page transitions from encrypted to decrypted, or vice
> > versa, attributes in the PTE must be updated *and* the hypervisor must
> > be notified of the change.
>
> Strictly speaking, that is not true for TDX. Conversion to shared can be
> implicit: setting the shared bit and touching the page will do the
> conversion. MapGPA is optional.
Interesting. Given that, is there a reason to use the explicit
hypervisor callbacks for private->shared transitions in
__set_mem_enc_pgtable()? It probably doesn't have direct relevance
to this patch series, but I'm just trying to understand the tradeoffs of
the implicit vs. explicit approach. And am I correct that
shared->private transitions must use the explicit approach?
>
> > Because there are two separate steps, there's
> > a window where the settings are inconsistent. Normally the code that
> > initiates the transition (via set_memory_decrypted() or
> > set_memory_encrypted()) ensures that the memory is not being accessed
> > during a transition, so the window of inconsistency is not a problem.
> > However, the load_unaligned_zeropad() function can read arbitrary memory
> > pages at arbitrary times, which could read a transitioning page during
> > the window. In such a case, CoCo VM specific exceptions are taken
> > (depending on the CoCo architecture in use). Current code in those
> > exception handlers recovers and does "fixup" on the result returned by
> > load_unaligned_zeropad(). Unfortunately, this exception handling can't
> > work in paravisor scenarios (TDX Partitioning and SEV-SNP in vTOM mode)
> > if the exceptions are routed to the paravisor. The paravisor can't
> > do load_unaligned_zeropad() fixup, so the exceptions would need to
> > be forwarded from the paravisor to the Linux guest, but there are
> > no architectural specs for how to do that.
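
For readers less familiar with load_unaligned_zeropad(), here is a minimal
user-space sketch of the result the fixup is meant to produce. It is only a
model, not the kernel implementation: model_load_zeropad() and the
mapped_len parameter are hypothetical names, and the "page that would
fault" is simulated by treating every byte at or beyond mapped_len as
inaccessible. The point is just that bytes from the inaccessible page read
back as zero instead of the access failing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model: load one 64-bit word starting at buf + off.
 * Bytes at offsets >= mapped_len stand in for a neighboring page that
 * would fault; they are returned as zero, which is the effect the
 * load_unaligned_zeropad() fixup is described as having.
 */
static uint64_t model_load_zeropad(const unsigned char *buf,
                                   size_t mapped_len, size_t off)
{
    unsigned char tmp[sizeof(uint64_t)] = { 0 };
    size_t avail = off < mapped_len ? mapped_len - off : 0;
    uint64_t val;

    assert(buf != NULL);

    /* Copy only the bytes that are actually accessible. */
    if (avail > sizeof(tmp))
        avail = sizeof(tmp);
    if (avail)
        memcpy(tmp, buf + off, avail);

    /* The remaining bytes of tmp are still zero: the "zeropad". */
    memcpy(&val, tmp, sizeof(val));
    return val;
}
```

In this model a load that starts 3 bytes before the end of the accessible
region yields those 3 bytes followed by 5 zero bytes, in memory order.
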
>
> Hm. Can't we inject #PF (or #GP) into L2 if the #VE/#VC handler in L1 sees
> a cross-page access to shared memory while there is no fixup entry for the
> page in L1? It would give L2 a chance to handle the situation in a
> transparent way.
>
> Maybe I miss something, I donno.
I'm recounting what the Hyper-V paravisor folks say without knowing all the
details. :-( But it seems like any kind of forwarding scheme needs to be a
well-defined contract that would work for both TDX and SEV-SNP. The
paravisor in L1 might or might not be Linux-based, so the contract must be OS
independent. And the L2 guest might or might not be Linux, so there's
potential for some other kind of error to be confused with a Linux
load_unaligned_zeropad() reference.
Maybe all that could be sorted out, but I come back to believing that it's
better now and in the long run to just avoid all this complexity by decoupling
private-shared page transitions and Linux load_unaligned_zeropad().
Unfortunately that decoupling hasn't been as simple as I first envisioned
because of SEV-SNP PVALIDATE needing a virtual address. But doing the
decoupling only in the paravisor case still seems like the simpler approach.
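
To make the decoupling idea concrete, here is a small user-space model of
the window, under stated assumptions. Nothing in it is kernel code: the
struct, the enum, and the function names are all hypothetical, and the
order of the two steps (PTE update first, then hypervisor notification) is
an assumption made for illustration. The model only shows why marking the
page not present first turns a stray access during the window into an
ordinary page fault, which the guest can fix up itself, instead of a
CoCo-specific exception that could be routed to the paravisor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum access_result {
    ACCESS_OK,             /* guest and host views agree */
    ACCESS_PAGE_FAULT,     /* ordinary fault: page not present */
    ACCESS_COCO_EXCEPTION  /* views disagree: CoCo-specific exception */
};

/* Hypothetical model of one page's state. */
struct model_page {
    bool present;         /* present bit in the guest PTE */
    bool pte_encrypted;   /* encrypted attribute in the guest PTE */
    bool host_encrypted;  /* hypervisor's view of the page */
};

/* What a stray load_unaligned_zeropad()-style access would hit. */
static enum access_result model_stray_access(const struct model_page *p)
{
    if (!p->present)
        return ACCESS_PAGE_FAULT;
    if (p->pte_encrypted != p->host_encrypted)
        return ACCESS_COCO_EXCEPTION;
    return ACCESS_OK;
}

/*
 * Run one transition and report whether a stray access inside the
 * window would have taken a CoCo-specific exception.
 */
static bool model_transition_sees_coco_exception(struct model_page *p,
                                                 bool encrypt,
                                                 bool hide_first)
{
    bool seen = false;

    assert(p != NULL);

    if (hide_first)
        p->present = false;       /* take the page out of the way */

    p->pte_encrypted = encrypt;   /* step 1: update PTE attributes */

    /* Window of inconsistency: a stray access may land here. */
    if (model_stray_access(p) == ACCESS_COCO_EXCEPTION)
        seen = true;

    p->host_encrypted = encrypt;  /* step 2: notify the hypervisor */
    p->present = true;            /* page is consistent again */
    return seen;
}
```

With hide_first set to false the model reports a CoCo-specific exception
in the window; with hide_first set to true the same stray access only sees
a not-present page.
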
Michael
>
> --
> Kiryl Shutsemau / Kirill A. Shutemov