[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <Z8resWzgtZaTuzEN@gmail.com>
Date: Fri, 7 Mar 2025 12:55:29 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Sohil Mehta <sohil.mehta@...el.com>
Cc: x86@...nel.org, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Ingo Molnar <mingo@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
"H . Peter Anvin" <hpa@...or.com>, Uros Bizjak <ubizjak@...il.com>,
Sandipan Das <sandipan.das@....com>,
Sean Christopherson <seanjc@...gle.com>,
Peter Zijlstra <peterz@...radead.org>,
Vegard Nossum <vegard.nossum@...cle.com>,
Tony Luck <tony.luck@...el.com>,
Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
Nikolay Borisov <nik.borisov@...e.com>,
Eric Biggers <ebiggers@...gle.com>, Xin Li <xin3.li@...el.com>,
Alexander Shishkin <alexander.shishkin@...el.com>,
Kirill Shutemov <kirill.shutemov@...ux.intel.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5] x86/cpufeature: Warn about unmet feature dependencies
* Sohil Mehta <sohil.mehta@...el.com> wrote:
> Currently, the cpuid_deps[] table is only exercised when a particular
> feature gets explicitly disabled and clear_cpu_cap() is called. However,
> some of these listed dependencies might already be missing during boot.
>
> These types of errors shouldn't generally happen in production
> environments but they could sometimes sneak through, especially when VMs
> and Kconfigs are in the mix. Also, the kernel might introduce artificial
> dependencies between unrelated features such as making LAM depend on
> LASS.
>
> Unexpected failures can occur when the kernel tries to use such a
> feature. Add a simple boot time scan of the cpuid_deps[] table to detect
> the missing dependencies. One option is to disable all of such features
> during boot but that may cause regressions in existing systems. For now,
> just warn about the missing dependencies to create awareness.
>
> As a trade-off between spamming the kernel log and keeping track of all
> the features that have been warned about, only warn about the first
> missing dependency. Any subsequent unmet dependency will only be logged
> after the first one has been resolved.
>
> Features are typically represented through unsigned integers within the
> kernel though some of them have user friendly names if they are exposed
> via cpuinfo. Show the friendlier name if available otherwise display
> the X86_FEATURE_* numerals to make it easier to identify the feature.
>
> Suggested-by: Tony Luck <tony.luck@...el.com>
> Suggested-by: Ingo Molnar <mingo@...hat.com>
> Signed-off-by: Sohil Mehta <sohil.mehta@...el.com>
> ---
> v5: Don't disable features with unmet dependencies (Ingo)
> Use pr_warn_once() for the warning (Ingo)
>
> v4: https://lore.kernel.org/lkml/20241210224037.3052555-1-sohil.mehta@intel.com/
> Update the log level to pr_info() (Ingo)
> Update the char buffer size to 16 (0day warning)
> Dropped Dave's review tag since the code has changed a bit more
> than usual.
>
> v3: Picked up the review tag.
> Reworded the commit message.
>
> v2: Use cpu_has() instead of boot_cpu_has() (Sean)
> ---
> arch/x86/include/asm/cpufeature.h | 1 +
> arch/x86/kernel/cpu/common.c | 4 ++++
> arch/x86/kernel/cpu/cpuid-deps.c | 35 +++++++++++++++++++++++++++++++
> 3 files changed, 40 insertions(+)
>
> diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
> index de1ad09fe8d7..1bb30e82ea77 100644
> --- a/arch/x86/include/asm/cpufeature.h
> +++ b/arch/x86/include/asm/cpufeature.h
> @@ -149,6 +149,7 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
>
> extern void setup_clear_cpu_cap(unsigned int bit);
> extern void clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int bit);
> +void scan_feature_dependencies(struct cpuinfo_x86 *c);
>
> #define setup_force_cpu_cap(bit) do { \
> \
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 7cce91b19fb2..52111d97c8ae 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1610,6 +1610,7 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
>
> c->cpu_index = 0;
> filter_cpuid_features(c, false);
> + scan_feature_dependencies(c);
>
> if (this_cpu->c_bsp_init)
> this_cpu->c_bsp_init(c);
> @@ -1870,6 +1871,9 @@ static void identify_cpu(struct cpuinfo_x86 *c)
> /* Filter out anything that depends on CPUID levels we don't have */
> filter_cpuid_features(c, true);
>
> + /* Scan for unmet dependencies based on the CPUID dependency table */
> + scan_feature_dependencies(c);
s/scane_feature_dependencies
/x86_check_cpufeature_deps
or so.
> +/*
> + * Return the feature "name" if available otherwise return
> + * the X86_FEATURE_* numerals to make it easier to identify
> + * the feature.
s/available otherwise
/available, otherwise
> + */
> +static const char *x86_feature_name(unsigned int feature, char *buf)
> +{
> + if (x86_cap_flags[feature])
> + return x86_cap_flags[feature];
> +
> + snprintf(buf, 16, "%d*32+%2d", feature / 32, feature % 32);
> +
> + return buf;
> +}
> +
> +void scan_feature_dependencies(struct cpuinfo_x86 *c)
> +{
> + char feature_buf[16], depends_buf[16];
> + const struct cpuid_dep *d;
> +
> + for (d = cpuid_deps; d->feature; d++) {
> + if (cpu_has(c, d->feature) && !cpu_has(c, d->depends)) {
> + /*
> + * Only warn about the first unmet dependency on the
> + * first CPU where it is encountered to avoid spamming
> + * the kernel log.
> + */
> + pr_warn_once("CPU%d: feature:%s may not work properly without feature:%s\n",
> + smp_processor_id(),
> + x86_feature_name(d->feature, feature_buf),
> + x86_feature_name(d->depends, depends_buf));
I'd make this a bit less passive-aggressive, something like:
x86 CPU feature dependency check failure: CPU%d has '%s' enabled but '%s' disabled. Kernel might be fine, but no guarantees.
Because initially most of these warnings will be about quirks and
oddities that happen to work fine in the current code.
Thanks,
Ingo
Powered by blists - more mailing lists