linux-kernel - RE: [EXTERNAL] Re: "Paravisor" Feature Enumeration

Message-ID:
 <CH8PR21MB522275E86FF5D33B04CE7A75CA87A@CH8PR21MB5222.namprd21.prod.outlook.com>
Date: Tue, 6 Jan 2026 02:12:36 +0000
From: Jon Lange <jlange@...rosoft.com>
To: Andrew Cooper <andrew.cooper3@...rix.com>, Dave Hansen
	<dave.hansen@...el.com>
CC: "Williams, Dan J" <dan.j.williams@...el.com>, Sean Christopherson
	<seanjc@...gle.com>, Paolo Bonzini <pbonzini@...hat.com>, John Starks
	<John.Starks@...rosoft.com>, Will Deacon <will@...nel.org>, Mark Rutland
	<mark.rutland@....com>, "linux-coco@...ts.linux.dev"
	<linux-coco@...ts.linux.dev>, LKML <linux-kernel@...r.kernel.org>, "Kirill A.
 Shutemov" <kirill.shutemov@...ux.intel.com>, "Edgecombe, Rick P"
	<rick.p.edgecombe@...el.com>
Subject: RE: [EXTERNAL] Re: "Paravisor" Feature Enumeration

Andrew wrote:

> Are we saying that, inside an opaque blob that a customer provides to a CSP to run we might have:
> * a paravisor and an unaware OS, or
> * svsm and a fully-aware OS, or
> * something in-between these two.
> and we're looking for a way to describe which piece of the interior stack owns which capability/service?
> I think the discussion would benefit greatly from having a couple of concrete examples of data this wants to hold,
> and how it is to be used at different levels of the interior software stack.

Here are two examples.  In both examples, the OS is running behind a paravisor but I wouldn't term it an "unaware OS".  Rather, the paravisor is present because of the set of services it provides, and it is running in paravisor mode (not SVSM mode) because the implementation benefits from taking full management responsibility for the confidential trust boundary (e.g. determination of when/how to validate/accept pages).  In such a configuration, where the paravisor has management responsibility for the confidential trust boundary, all of the enlightenments in the guest OS for managing confidentiality state must be suppressed.  The straightforward way to do this is for the paravisor to suppress the confidential VM enumeration information visible to the guest OS (the "SNP available" CPUID bit, or the "TDX active" bit, for example).

Note that this occurs out of necessity because we can't have the paravisor and the guest OS fighting over who has the right/responsibility to execute PVALIDATE, or TDG.MEM.PAGE.ACCEPT, or whatever.  The kernel today only has two concepts of its execution mode: either it is a confidential VM, in which case it takes full responsibility, or it is not a confidential VM, in which case it ignores the responsibility.  When a paravisor (not SVSM) is active, we have to operate in the second mode because the first mode would provoke precisely the conflict we're trying to avoid. 

First example: a confidential VM running under a paravisor wants to obtain an attestation report for itself to pass to a third party to vouch for the fact that it is a confidential VM.  Assume in this example that the relying party is aware of the paravisor and the paravisor's measurements, so the evidence provided in such an attestation report can successfully be verified as authentic.  In order for this to be possible, the kernel has to know that it's running in a confidential VM in a mode where attestation reports are available but where the responsibility for confidential memory state management is suppressed.  This is a third state beyond the two states described above.  This isn't just a userspace problem because access to the attestation service is mediated by a kernel-mode driver that needs to know how to configure itself (such configuration today is based on CPUID and not on ACPI).
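
As a rough illustration of the "third state" described above, here is a minimal C sketch. All names (coco_mode, COCO_MODE_*, and both helpers) are hypothetical and exist only for this example; they are not existing kernel interfaces. The sketch only shows that the question "who manages confidential memory state" can be separated from the question "are confidential services such as attestation available".

```c
#include <stdbool.h>

/* Hypothetical names; nothing here is an existing kernel interface. */
enum coco_mode {
    COCO_MODE_NONE,      /* not a confidential VM: ignore confidentiality state */
    COCO_MODE_FULL,      /* confidential VM: guest validates/accepts pages itself */
    COCO_MODE_PARAVISOR, /* confidential VM, but the paravisor owns the trust boundary */
};

/* Should the guest issue PVALIDATE / TDG.MEM.PAGE.ACCEPT itself? */
static bool coco_guest_manages_memory(enum coco_mode mode)
{
    return mode == COCO_MODE_FULL;
}

/* Should the attestation driver configure itself and expose reports? */
static bool coco_attestation_available(enum coco_mode mode)
{
    return mode == COCO_MODE_FULL || mode == COCO_MODE_PARAVISOR;
}
```

In this sketch, COCO_MODE_PARAVISOR is the only mode in which the two answers differ: attestation is available, but the guest leaves page validation and acceptance to the paravisor.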

Second example: a confidential VM running under a paravisor determines that one of the devices available to it is a TDISP device that requires the OS - not the paravisor - to perform the operations required to configure the device, to obtain and verify its attestation information, and to consent to activating the device in the TDISP RUN state.  In order for the OS to be able to execute that sequence, the OS has to know that it is running as a confidential VM so it knows that TDISP configuration may be necessary.
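
The sequence in this second example (configure, verify attestation, consent to RUN) can be sketched in C as follows. This is only an illustration under stated assumptions: the state names are modeled on the TDISP device interface states, while struct tdisp_dev, the three helper functions, and the is_confidential_vm flag are all hypothetical and do not correspond to any existing kernel API. Real verification of device evidence is replaced by a boolean stand-in.

```c
#include <stdbool.h>

/* State names modeled on the TDISP state machine; everything else is hypothetical. */
enum tdi_state { TDI_CONFIG_UNLOCKED, TDI_CONFIG_LOCKED, TDI_RUN, TDI_ERROR };

struct tdisp_dev {
    enum tdi_state state;
    bool report_verified; /* set once the OS has checked the device's attestation */
};

/* Lock the device configuration so that it can be measured and attested. */
static bool tdisp_lock(struct tdisp_dev *dev)
{
    if (dev->state != TDI_CONFIG_UNLOCKED)
        return false;
    dev->state = TDI_CONFIG_LOCKED;
    return true;
}

/* Stand-in for real verification of the device's attestation evidence. */
static void tdisp_verify(struct tdisp_dev *dev, bool evidence_ok)
{
    dev->report_verified = (dev->state == TDI_CONFIG_LOCKED) && evidence_ok;
}

/*
 * The OS consents to RUN only if it knows it is a confidential VM and has
 * verified the locked device.
 */
static bool tdisp_start(struct tdisp_dev *dev, bool is_confidential_vm)
{
    if (!is_confidential_vm || dev->state != TDI_CONFIG_LOCKED ||
        !dev->report_verified)
        return false;
    dev->state = TDI_RUN;
    return true;
}
```

If the confidential VM enumeration is suppressed, is_confidential_vm is false in this sketch and the device never reaches RUN, which is the gap the example describes.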

We can quibble about whether there are better ways to accomplish these specific scenarios - for example, you could say that the availability of the attestation device should be handled by ACPI instead of CPUID and thus the firmware should take responsibility for figuring out whether it's present, and you could say that the PCI subsystem uses some additional information (possibly more ACPI information) to indicate that TDISP devices may be present.  However, these two examples are far from an exhaustive list and it's hard to imagine that we won't discover a third or fourth scenario that doesn't lend itself to bootstrapping in the firmware (and I'm not even convinced that these two scenarios can neatly be handled by firmware conventions).  Defining "paravisor mode" gives us one more tool to figure out how to enable confidential services without requiring confidential management.

-Jon 

-----Original Message-----
From: Andrew Cooper <andrew.cooper3@...rix.com> 
Sent: Monday, January 5, 2026 5:45 PM
To: Dave Hansen <dave.hansen@...el.com>; Jon Lange <jlange@...rosoft.com>
Cc: Andrew Cooper <andrew.cooper3@...rix.com>; Williams, Dan J <dan.j.williams@...el.com>; Sean Christopherson <seanjc@...gle.com>; Paolo Bonzini <pbonzini@...hat.com>; John Starks <John.Starks@...rosoft.com>; Will Deacon <will@...nel.org>; Mark Rutland <mark.rutland@....com>; linux-coco@...ts.linux.dev; LKML <linux-kernel@...r.kernel.org>; Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>; Edgecombe, Rick P <rick.p.edgecombe@...el.com>
Subject: [EXTERNAL] Re: "Paravisor" Feature Enumeration

On 05/01/2026 9:42 pm, Dave Hansen wrote:
> First,
>
> Jon and John gave a talk in Tokyo about feature enumeration under
> paravisors:
>
>> https://lpc.events/event/19/contributions/2188/attachments/1896/4057/05-Paravisor-Integration-with-Confidential-Services.pdf
> The tl;dr for me at least was that they'd like a common and consistent 
> means of enumerating these features in OSes, regardless of the
> environment: TDX, SEV-SNP or even ARM CCA.

I agree that it seems like "just" an enumeration problem, but despite attending the presentation and rereading the slides, I'm still not clear on the precise scope.

Are we saying that, inside an opaque blob that a customer provides to a CSP to run we might have:

* a paravisor and an unaware OS, or
* svsm and a fully-aware OS, or
* something in-between these two.

and we're looking for a way to describe which piece of the interior stack owns which capability/service?

If so, it can't come in from the outside; given that it's the capability enumeration, there's a chicken/egg problem with verifying the integrity.

It seems like it needs to be produced by whatever the first code to run is, after gathering capabilities in a vendor-specific way, and deciding which services it wants to provide, and which to delegate.

And if so, then it definitely cannot be in CPUID because that needs to be fixed before the guest starts to run, and doesn't express dynamic properties of the system.[*]
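
To make the idea of "first code to run decides what to provide and what to delegate" concrete, here is a minimal C sketch of such an ownership table. The service bits, struct svc_table and svc_table_build() are hypothetical illustrations, not a proposed ABI; in particular, how the available set is gathered is vendor-specific and is simply passed in as a bitmask here.

```c
#include <stdint.h>

/* Hypothetical service bits and layout; not a proposed ABI. */
#define SVC_MEM_ACCEPT  (1u << 0) /* page validation / acceptance */
#define SVC_ATTESTATION (1u << 1) /* attestation reports */
#define SVC_TDISP       (1u << 2) /* TDISP device configuration */

struct svc_table {
    uint32_t provided;  /* kept by the first-stage code (e.g. a paravisor) */
    uint32_t delegated; /* handed to the next layer (the guest OS) */
};

/*
 * available: what the platform offers, gathered in a vendor-specific way.
 * keep:      what the first-stage code wants to own itself.
 */
static struct svc_table svc_table_build(uint32_t available, uint32_t keep)
{
    struct svc_table t;

    t.provided  = available & keep;
    t.delegated = available & ~keep;
    return t;
}
```

For example, svc_table_build(SVC_MEM_ACCEPT | SVC_ATTESTATION | SVC_TDISP, SVC_MEM_ACCEPT) describes a first stage that keeps memory acceptance for itself and delegates attestation and TDISP to the OS. Each service ends up with exactly one owner, and services the platform does not offer appear in neither field.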


I think the discussion would benefit greatly from having a couple of concrete examples of data this wants to hold, and how it is to be used at different levels of the interior software stack.

Thanks,

~Andrew

[*] Yes, I know CPUID does have some dynamic properties.  I think most people would agree that life would be better without them.