[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aJ-pJvrPyHyPI0qS@google.com>
Date: Fri, 15 Aug 2025 14:39:50 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Borislav Petkov <bp@...en8.de>
Cc: Mario Limonciello <mario.limonciello@....com>, Yazen Ghannam <yazen.ghannam@....com>, x86@...nel.org,
linux-kernel@...r.kernel.org, Libing He <libhe@...hat.com>,
David Arcari <darcari@...hat.com>
Subject: Re: [PATCH] x86/CPU/AMD: Ignore invalid reset reason value
On Fri, Jul 25, 2025, Borislav Petkov wrote:
> On Thu, Jul 24, 2025 at 02:32:41PM -0700, Sean Christopherson wrote:
> > Not necessarily. There are a variety of use cases for doing nearly-full passthrough
> > of bare metal state into a VM, e.g. to deprivilege the "main" OS, interpose and/or
> > isolate select resources, etc. I don't think it's too far fetched to imagine one
> > or more such use cases exposing this range of MMIO to the guest _and_ also setting
> > the HYPERVISOR bit in CPUID.
>
> What would be that use case?
>
> To tell the guest why the hypervisor rebooted?
>
> Sounds weird to me but virt does weird things sometimes. :-P
>
> > But whether or not there's a legitimate use case is beside the point. I'm not
> > arguing this is at all useful for VMs. I'm arguing _against_ splattering
> > X86_FEATURE_HYPERVISOR checks all over the place just because an error was first
> > (or only) observed while running in a VM.
>
> We use X86_FEATURE_HYPERVISOR to gate purely-hw-only features behind it.
And I'm advocating that we stop doing that (with maybe a few exceptions), because
there are very, very few features that are truly bare metal only. Some things,
e.g. SEV-SNP, might require proxying certain actions, but even SEV-SNP can be
exposed to a virtual machine.
> Because they don't make any sense for guests. Just like this one.
You're conflating running under (on?) a hypervisor with being a guest in the
traditional model of virtualization. Running a privileged, host-owned software
stack in a virtual machine is beneficial/desirable for a variety of use cases,
and I expect the number of such use cases to grow in the near future.
E.g. as mentioned earlier, pKVM uses virtualization to de-privilege the kernel,
and Hyper-V's VBS uses virtualization to monitor accesses to sensitive state.
Future use cases could use virtualization:
- to improve system stability/uptime, e.g. restart the VM if the kernel crashes
as opposed to rebooting the entire host
- to isolate different software components, e.g. to run post-boot functionality
in a separate VM to limit the impact of an escape/flaw/bug
- to limit the amount of code that has direct access to system physical resources
- to partion a system into multiple machines (each running their own kernel)
for performance reason; e.g. AIUI, kernel-wide locks are becomming more and
more problematic as the number of cores continues to go up
Not all of those use cases will want to access "bare metal" features, but my
point is that inferring anything about the system setup based solely on the
HYPERVISOR flag is all but guaranteed to break sooner or later.
> And even if this one makes sense for some virt scenario, we want those
> folks to *actually* explain why removing the HV check for that feature
> makes sense. And why the kernel needs to support it.
That seems like a gigantic waste of time/effort. E.g. imagine the pain/churn if
MONITOR/MWAIT had been limited to !HYPERVISOR simply because early VM setups
didn't support executing MONITOR/MWAIT natively.
> Just like loading microcode in a guest, for example. There's a reason we
> don't do that. And when some day, someone appears and wants to do that, I
> would like there to be an explicit patch removing that HV check in the loader
> and explaining *why* it is doing so.
Patch loading is exceptional because, AFAIK, neither Intel nor AMD officially
support ucode updates while running as a guest. Though FWIW, I do expect that
sooner or later someone will indeed want to do patch loads in a VM.
And even if Intel/AMD never support doing the actual patch load from guest code,
I can definitely see a trap-and-execute scheme, e.g. where the bare metal kernel
doesn't have network or file system access, and so the ucode patch needs to be
routed through a highly privileged VM.
> And until that day, that feature is hw-only. Just like this one.
>
> And yeah, I know you don't like X86_FEATURE_HYPERVISOR
I don't mind HYPERVISOR per se, what I don't like is disabling code (or going
down different paths) based solely on historical use cases, i.e. based on the
assumption that "no one will ever do xyz!". I have no objection to pivoting
on HYPERVISOR if there is an explicitly documented limitation, or for cases where
there's a very high probability of making a bad decision, e.g. using FMS to
enumerate features.
The odds of _this_ particular code patch being problematic/interesting are low,
but I really don't want to set a precedent that it's ok to gate code on !HYPERVISOR
without a strong, documented technical reason for doing so.
> but I would like to save some of my sanity when looking at a bug report which
> says that the reboot reason reporting is talking crap, only to find out that
> a HV underneath is doing something silly, muddying up the waters.
That cuts both ways though, e.g. I don't want to debug issues due to Linux not
behaving as expected because someone incorrectly assumed a feature would never
be exposed to a virtual machine.
IMO, if you're regularly spending more than a few seconds sussing out the
environment from a bug report, then we need to improve reporting processes and/or
infrastructure.
Powered by blists - more mailing lists