Message-ID: <aIKehTDgP-Nu36ol@google.com>
Date: Thu, 24 Jul 2025 13:58:45 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Borislav Petkov <bp@...en8.de>
Cc: Yazen Ghannam <yazen.ghannam@....com>, x86@...nel.org, linux-kernel@...r.kernel.org,
Libing He <libhe@...hat.com>, David Arcari <darcari@...hat.com>,
Mario Limonciello <mario.limonciello@....com>
Subject: Re: [PATCH] x86/CPU/AMD: Ignore invalid reset reason value
On Wed, Jul 23, 2025, Borislav Petkov wrote:
> On July 23, 2025 9:34:26 PM GMT+03:00, Yazen Ghannam <yazen.ghannam@....com> wrote:
> >On Tue, Jul 22, 2025 at 06:56:15PM +0200, Borislav Petkov wrote:
> >> On Mon, Jul 21, 2025 at 06:11:54PM +0000, Yazen Ghannam wrote:
> >> > The reset reason value may be "all bits set", e.g. 0xFFFFFFFF. This is a
> >> > commonly used error response from hardware. This may occur due to a real
> >> > hardware issue or when running in a VM.
> >>
> >> Well, which is it Libing is reporting? VM or a real hw issue?
> >>
> >
> >In this case, it was a VM.
> >
> >> If it is a VM, is that -1 the only thing a VMM returns when reading that
> >> MMIO address or can it be anything?
> >>
> >> If latter, you need to check X86_FEATURE_HYPERVISOR.
> >>
> >> Same for a real hw issue.
> >>
> >> IOW, is -1 the *only* invalid data we can read here or are we playing
> >> whack-a-mole with it?
> >>
> >
> >I see your point, but I don't think we can know for sure all possible
> >cases. There are some reserved bits that shouldn't be set. But these
> >definitions could change in the future.
> >
> >And it'd be a pain to try and verify combinations of bits and configs.
> >Like can bit A and B be set together, or can bit C be set while running
> >in a VM, or can bit D ever be set on Model Z?
> >
> >The -1 (all bits set) is the only "applies to all cases" invalid data,
> >since this is a common hardware error response. So we can at least check
> >for this.
> >
> >Thanks,
> >Yazen
>
> I think you should check both: HV or -1.
>
> HV covers the VM angle as they don't emulate this

You can't possibly know that. If there exists a hardware spec of any kind, it's
fair game for emulation.

> and we simply should disable this functionality when running as a guest.
>
> -1 covers the known-bad hw value.

And in a guest, -1, i.e. 0xffffffff, is all but guaranteed to come from the VMM
providing PCI master abort semantics for reads to MMIO where no device exists.
That's about as "architectural" of behavior as you're going to get, so I don't
see any reason to assume no VMM will ever emulate whatever this feature is.
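
To make the above concrete, here is a minimal user-space sketch, not the actual
patch. Everything prefixed with demo_ is hypothetical, as are the address range
and the register value; the only thing taken from this thread is that a read
of MMIO with no device behind it comes back as all ones, and that the reset
reason code can filter on exactly that value without caring whether it is
running as a guest.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical guest-physical range backed by one emulated device. */
#define DEMO_DEVICE_BASE 0x1000u
#define DEMO_DEVICE_SIZE 0x100u

/* Arbitrary placeholder register contents, not a real reset reason encoding. */
uint32_t demo_device_reg = 0x00080000u;

/*
 * Model of how a VMM might service a 32-bit MMIO read: addresses inside the
 * emulated device return its register, everything else returns all ones,
 * mimicking PCI master abort semantics.
 */
uint32_t demo_mmio_read32(uint64_t addr)
{
	if (addr >= DEMO_DEVICE_BASE &&
	    addr < (uint64_t)DEMO_DEVICE_BASE + DEMO_DEVICE_SIZE)
		return demo_device_reg;

	return UINT32_MAX;
}

/*
 * Consumer-side check: "all bits set" is treated as an error response and
 * ignored. No hypervisor check is involved, so a VMM that does emulate the
 * register would still have its value reported.
 */
bool demo_reset_reason_is_usable(uint32_t value)
{
	return value != UINT32_MAX;
}
```

With this model, a read that misses the device is rejected by the value check
alone, while a read that hits the emulated register is accepted.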