linux-kernel - Re: [PATCH] x86/its: use Sapphire Rapids+ feature to opt out

Message-ID: <1D952EBC-CA16-49FE-8AD0-48BCE038332B@nutanix.com>
Date: Tue, 21 Oct 2025 13:40:01 +0000
From: Jon Kohler <jon@...anix.com>
To: Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>
CC: Dave Hansen <dave.hansen@...el.com>, Thomas Gleixner <tglx@...utronix.de>,
        Borislav Petkov <bp@...en8.de>, Peter Zijlstra <peterz@...radead.org>,
        Josh Poimboeuf <jpoimboe@...nel.org>,
        Jonathan Corbet <corbet@....net>, Ingo Molnar <mingo@...hat.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "x86@...nel.org" <x86@...nel.org>, "H. Peter Anvin" <hpa@...or.com>,
        Brian Gerst <brgerst@...il.com>,
        Brendan Jackman <jackmanb@...gle.com>,
        "Ahmed S. Darwish" <darwi@...utronix.de>,
        Alexandre Chartre <alexandre.chartre@...cle.com>,
        "linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] x86/its: use Sapphire Rapids+ feature to opt out



> On Oct 20, 2025, at 6:41 PM, Pawan Gupta <pawan.kumar.gupta@...ux.intel.com> wrote:
> 
> On Mon, Oct 20, 2025 at 03:09:41PM -0700, Dave Hansen wrote:
>> On 10/20/25 13:40, Pawan Gupta wrote:
>>>> I can’t speak to other VMMs (e.g. vmw, hyperv, hyperscalers) and how they do
>>>> it, but I suspect there are similar challenges around post-launch feature/bit
>>>> additions that require the VM to be completely cold-booted.

First, I want to apologize for the confusion yesterday about the QEMU
enablement. I’m sorry I misrepresented that; it wasn’t intentional, I just got
my wires crossed.

The baseline its-no enablement is there: if you’ve got both the QEMU and kernel
sides together, its-no=yes should work. Note that it isn’t exposed natively on
any CPU models, from the looks of it. I can propose a patch on the QEMU side for
that.

>>> Ok, that makes BUS_LOCK_DETECT a better choice than BHI_CTRL. I think it
>>> would be better to replace BHI_CTRL with BUS_LOCK_DETECT.
>> 
>> Folks, I just think this kind of random feature spaghetti voodoo is a
>> bad idea. Suppose X86_FEATURE_BUS_LOCK_DETECT is in silicon on an
>> affected part but normally fused off. But a big customer shows up with a
>> big checkbook and Intel releases microcode to enumerate
>> X86_FEATURE_BUS_LOCK_DETECT on an affected part.
> 
> Hmm, right.
> 
>> What then?
>> 
>> Your only choice is to convince Intel to make architectural the idea
>> that X86_FEATURE_BUS_LOCK_DETECT is never enumerated on an affected part.
>> 
>> Because even if we go forward with that patch we've *DONE* that in
>> Linux: we've made it de facto architecture and Intel can never change it.
> 
> Using BHI_CTRL here was in agreement with CPU architects. Even though it's a
> heuristic, it is very unlikely to be broken by a microcode update.
> 
> I can't say for sure about BUS_LOCK_DETECT.
> 
>> Can someone try to boil down the problem statement for me again, please?
>> 
>> VMs are slow because of mitigations for issues to which they are
>> not vulnerable when running old kernels on old hypervisors.

The problem statement is that ITS is on by default, even on non-impacted
hardware. At least for the QEMU ecosystem, the feature chosen (BHI_CTRL) was not
exposed at the same time the SPR model was introduced, so guests on platforms
that don’t have BHI_CTRL or ITS_NO for any reason take a performance hit
mitigating an issue that they don’t actually have.

To boil it down:
A guest VM that updates to an ITS-enabled guest kernel sees performance
impacts on non-vulnerable hardware when running on non-BHI_CTRL and/or
non-ITS_NO hypervisors, which is a very easy situation to get into, especially
on QEMU with live-migration-enabled pools.
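
To make the failure mode concrete, here is a minimal C sketch of the guest-side
decision as described in this thread. It is an illustrative model only, not the
kernel's code: struct guest_view and both function names are hypothetical, and
hw_affected stands for ground truth that a real guest cannot observe.

```c
#include <stdbool.h>

/* Hypothetical model of what a guest can see; not a kernel structure. */
struct guest_view {
	bool its_no;      /* ITS_NO immunity bit exposed by the VMM */
	bool bhi_ctrl;    /* BHI_CTRL enumerated to the guest */
	bool hw_affected; /* ground truth; invisible to a real guest */
};

/* Mitigate unless one of the opt-out bits is visible to the guest. */
static bool guest_applies_its_mitigation(const struct guest_view *g)
{
	if (g->its_no)
		return false;
	if (g->bhi_ctrl)
		return false;
	return true; /* neither bit visible: assume vulnerable */
}

/* True when the guest pays the mitigation cost on unaffected hardware. */
static bool needless_slowdown(const struct guest_view *g)
{
	return guest_applies_its_mitigation(g) && !g->hw_affected;
}
```

A guest on unaffected hardware whose CPU model exposes neither bit makes
needless_slowdown() return true; exposing either bit makes it return false.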

> 
> From what I understand:
> 
>  Unless a VM is cold-booted, it cannot see the new features/immunity bits
>  exposed by the hypervisor. In this particular case, a guest gets the
>  updated kernel with ITS mitigation, but can't see the immunity bit unless
>  it is cold-booted.
> 
>  The other part of the problem is when host kernel/hypervisor is not
>  updated. In this case immunity bit is not exposed to the guest at all.
> 
> My 2 cents: All of this makes me feel that instead of exposing the immunity
> bit, a guest should be told about the bug presence. That way security
> minded users who update regularly get the bug enumerations, and hence the
> mitigations. OTOH, performance focused users who don't update/cold-boot
> often don't get unnecessarily slowed down.
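
The difference between the two enumeration styles in the quoted proposal can be
sketched as follows. This is a hypothetical model: the enum, its values, and
guest_mitigates() are names invented for illustration, and no such bug-presence
bit is claimed to exist today.

```c
#include <stdbool.h>

/* Hypothetical enumeration styles; neither name comes from the kernel. */
enum enum_policy {
	POLICY_IMMUNITY_BIT, /* VMM says "you are NOT affected" (ITS_NO style) */
	POLICY_BUG_BIT,      /* VMM says "you ARE affected" (the proposal) */
};

/*
 * bit_visible: whether the guest can currently see the relevant bit.
 * A guest that has not been cold-booted, or that runs on a host that
 * has not been updated, sees no new bit under either policy.
 */
static bool guest_mitigates(enum enum_policy policy, bool bit_visible)
{
	switch (policy) {
	case POLICY_IMMUNITY_BIT:
		return !bit_visible; /* default on, bit opts out */
	case POLICY_BUG_BIT:
		return bit_visible;  /* default off, bit opts in */
	}
	return true; /* unknown policy: stay on the safe side */
}
```

With no bit visible, the immunity-bit model mitigates and the bug-bit model
does not. In this model that is the trade-off: guests that never see the new
bit avoid the slowdown, but they also get no mitigation if the hardware is in
fact affected.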