Message-ID: <553a85a4-cbe9-4fd7-a404-2b793a807947@intel.com>
Date: Thu, 8 Jan 2026 10:11:06 +0800
From: Xiaoyao Li <xiaoyao.li@...el.com>
To: Dave Hansen <dave.hansen@...el.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
Borislav Petkov <bp@...en8.de>, x86@...nel.org
Cc: "H. Peter Anvin" <hpa@...or.com>, reinette.chatre@...el.com,
Kiryl Shutsemau <kas@...nel.org>, rick.p.edgecombe@...el.com,
linux-kernel@...r.kernel.org, chenyi.qiang@...el.com, chao.p.peng@...el.com
Subject: Re: [PATCH v2] x86/split_lock: Handle unexpected split lock as fatal
On 1/7/2026 11:19 PM, Dave Hansen wrote:
> On 1/7/26 05:49, Xiaoyao Li wrote:
>> + /*
>> + * If #AC occurs on split lock without X86_FEATURE_SPLIT_LOCK_DETECT
>> + * the kernel cannot handle it by disabling the detection. Treat it as
>> + * fatal regardless of the sld_state.
>> + */
>> + if (!cpu_feature_enabled(X86_FEATURE_SPLIT_LOCK_DETECT))
>> + return true;
>
> If #AC occurs on split lock without X86_FEATURE_SPLIT_LOCK_DETECT, that
> sounds more like a naughty hypervisor or buggy CPU that deserves a
> BUG_ON() rather than a situation where the kernel wants to move merrily
> along.
Yes. Such behavior is non-architectural.
1) If it happens on bare metal, the CPU is broken.
2) If it happens in a guest, the hypervisor is doing something wrong.
> This also needs an explanation in the changelog about _why_
> X86_FEATURE_SPLIT_LOCK_DETECT isn't set and can't be set. It needs to
> explain why enumeration is not present *AND* is impossible to add.
The only case I know of where such non-architectural behavior can happen
is a TDX guest. It is a virtualization case, and
X86_FEATURE_SPLIT_LOCK_DETECT cannot be virtualized in a sane
manner because MSR_TEST_CTRL has per-core scope. Enumerating
X86_FEATURE_SPLIT_LOCK_DETECT to a guest means the guest can
enable/disable the feature freely on its own. However, on an HT system,
if the guest disables the feature for its vCPU, it also disables the
feature for the sibling CPU on the same core, where host processes
or other VMs might be running. Even on a non-HT system, allowing the
guest to disable the feature would defeat the host's intent of not
tolerating any split lock when the host is set to fatal mode.
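
To illustrate the per-core scope problem, here is a userspace toy model
(not kernel code). The helper names, the two-core layout, and the
assumption that CPUs 2n and 2n+1 are HT siblings are all made up for the
example; only the bit position (bit 29 of MSR_TEST_CTRL) matches the real
MSR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 29 of MSR_TEST_CTRL enables split lock detection. */
#define TEST_CTRL_SPLIT_LOCK_DETECT	(1ULL << 29)

/* Toy topology (assumption): 2 cores, 2 HT siblings per core. */
#define NR_CORES		2
#define THREADS_PER_CORE	2
#define NR_CPUS			(NR_CORES * THREADS_PER_CORE)

/* One MSR_TEST_CTRL value per core, shared by both siblings. */
static uint64_t core_test_ctrl[NR_CORES];

static int cpu_to_core(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return cpu / THREADS_PER_CORE;
}

/* Hypothetical helper: a write from any logical CPU hits the core's copy. */
static void toy_wrmsr_test_ctrl(int cpu, uint64_t val)
{
	core_test_ctrl[cpu_to_core(cpu)] = val;
}

/* Hypothetical helper: a read returns the value shared by the whole core. */
static uint64_t toy_rdmsr_test_ctrl(int cpu)
{
	return core_test_ctrl[cpu_to_core(cpu)];
}

static bool toy_sld_enabled(int cpu)
{
	return (toy_rdmsr_test_ctrl(cpu) & TEST_CTRL_SPLIT_LOCK_DETECT) != 0;
}
```

In this model, if the host enables detection on CPU 0 and a guest vCPU
running on CPU 1 then writes 0, toy_sld_enabled(0) becomes false as well,
while the other core (CPUs 2 and 3) is unaffected.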
On the other hand, the question could be "why does a split lock raise
#AC if the feature is not available, and can that be fixed so that no
#AC is raised?" For this question:
1) If it happens on bare metal, the CPU is broken. The kernel cannot fix it.
2) If it happens in a guest, it is presumably because the hypervisor has
the feature enabled in the hardware MSR while the guest is running. To
fix it, the hypervisor can intercept the #AC and handle it itself instead
of letting the #AC be delivered to the guest. This is what KVM already
does for normal guests. However, for a TDX guest, KVM cannot intercept
#AC. That needs changes in the TDX module to provide such an ability.
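
The guest cases above can be summarized with a small toy model. This is
only a sketch of the reasoning, not KVM or TDX module code; the struct,
the enum, and the function names are hypothetical, and the "unexpected"
check only mirrors the feature test from the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_ac_result {
	TOY_AC_NOT_RAISED,		/* detection off in hardware: no #AC */
	TOY_AC_HANDLED_BY_HOST,		/* hypervisor intercepted the #AC */
	TOY_AC_DELIVERED_TO_GUEST,	/* guest kernel receives the #AC */
};

struct toy_vm {
	/* Detection is enabled in the hardware MSR while the guest runs. */
	bool hw_sld_enabled;
	/* True for a normal KVM guest, false for a TDX guest (per the text). */
	bool host_can_intercept_ac;
};

/* Where does the #AC caused by a guest split lock end up? */
static enum toy_ac_result toy_guest_split_lock(const struct toy_vm *vm)
{
	assert(vm);

	if (!vm->hw_sld_enabled)
		return TOY_AC_NOT_RAISED;
	if (vm->host_can_intercept_ac)
		return TOY_AC_HANDLED_BY_HOST;
	return TOY_AC_DELIVERED_TO_GUEST;
}

/*
 * Guest side: the #AC is unexpected when it arrives although the guest
 * never saw the feature enumerated. With the feature enumerated, the
 * outcome depends on sld_state, which is not modelled here.
 */
static bool toy_guest_ac_unexpected(enum toy_ac_result res,
				    bool guest_has_sld_feature)
{
	return res == TOY_AC_DELIVERED_TO_GUEST && !guest_has_sld_feature;
}
```

Only the combination "detection enabled in hardware" plus "host cannot
intercept" delivers the #AC to a guest that was never told about the
feature, which is the TDX guest case.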