lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c271cbf5-dfb5-c3dd-002b-9a358c90e984@intel.com>
Date:   Wed, 22 Jun 2022 08:32:38 -0700
From:   Dave Hansen <dave.hansen@...el.com>
To:     Maxim Levitsky <mlevitsk@...hat.com>, linux-kernel@...r.kernel.org
Cc:     "Chang S. Bae" <chang.seok.bae@...el.com>,
        Jiri Olsa <jolsa@...nel.org>, linux-perf-users@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>,
        "H. Peter Anvin" <hpa@...or.com>,
        "David S. Miller" <davem@...emloft.net>,
        Borislav Petkov <bp@...en8.de>,
        Kees Cook <keescook@...omium.org>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Namhyung Kim <namhyung@...nel.org>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
        Herbert Xu <herbert@...dor.apana.org.au>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <x86@...nel.org>,
        Jane Malalane <jane.malalane@...rix.com>,
        Tony Luck <tony.luck@...el.com>,
        Ingo Molnar <mingo@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        "open list:CRYPTO API" <linux-crypto@...r.kernel.org>,
        Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [PATCH 4/4] x86/cpuid: check for dependencies violations in CPUID
 and attempt to fix them

On 6/22/22 07:48, Maxim Levitsky wrote:
> Due to configuration bugs, sometimes a CPU feature is disabled in CPUID,
> but not features that depend on it.
> 
> While the above is not supported, the kernel should try to not crash,
> and clearing the dependent cpu caps is the best way to do it.

That's a rather paltry changelog.

If I remember correctly, there's a crystal clear problem:

	If a CPU enumerates support for AVX2 but AVX via CPUID, the
	kernel crashes.

There's also a follow-on problem.  The kernel has all the data it needs
to fix this, but just doesn't consult it:

	To make matters worse, the kernel _knows_ that this is an ill-
	advised situation: The kernel prevents itself from clearing the
	software representation of the AVX CPUID bit without also
	clearing AVX2.

	But, the kernel only consults this knowledge when it is clearing
	cpu_cap bits.  It does not consult this information when it is
	populating those cpu_cap bits.

> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 4cc79971d2d847..c83a8f447d6aed 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1469,7 +1469,7 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
>  			this_cpu->c_early_init(c);
>  
>  		c->cpu_index = 0;
> -		filter_cpuid_features(c, false);
> +		filter_cpuid_features(c, true);
>  
>  		if (this_cpu->c_bsp_init)
>  			this_cpu->c_bsp_init(c);
> @@ -1757,7 +1757,7 @@ static void identify_cpu(struct cpuinfo_x86 *c)
>  	 */
>  
>  	/* Filter out anything that depends on CPUID levels we don't have */
> -	filter_cpuid_features(c, true);
> +	filter_cpuid_features(c, false);
>  
>  	/* If the model name is still unset, do table lookup. */
>  	if (!c->x86_model_id[0]) {

While we're at it, could we please rid ourselves of this unreadable
mystery true/false gunk?

> diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
> index bcb091d02a754b..6d9c0e39851805 100644
> --- a/arch/x86/kernel/cpu/cpuid-deps.c
> +++ b/arch/x86/kernel/cpu/cpuid-deps.c
> @@ -94,6 +94,11 @@ static inline void clear_feature(struct cpuinfo_x86 *c, unsigned int feature)
>  		set_bit(feature, (unsigned long *)cpu_caps_cleared);
>  }
>  
> +static inline bool test_feature(struct cpuinfo_x86 *c, unsigned int feature)
> +{
> +	return test_bit(feature, (unsigned long *)c->x86_capability);
> +}
> +
>  /* Take the capabilities and the BUG bits into account */
>  #define MAX_FEATURE_BITS ((NCAPINTS + NBUGINTS) * sizeof(u32) * 8)
>  
> @@ -127,6 +132,7 @@ void clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int feature)
>  	} while (changed);
>  }
>  
> +
>  void setup_clear_cpu_cap(unsigned int feature)

More superfluous whitespace.

>  {
>  	clear_cpu_cap(&boot_cpu_data, feature);
> @@ -137,6 +143,10 @@ void setup_clear_cpu_cap(unsigned int feature)
>   * Some CPU features depend on higher CPUID levels, which may not always
>   * be available due to CPUID level capping or broken virtualization
>   * software.  Add those features to this table to auto-disable them.
> + *
> + * Also due to configuration bugs, some CPUID features might be present
> + * while CPUID features that they depend on are not present,
> + * e.g a AVX2 present but AVX is not present.
>   */
>  struct cpuid_dependent_feature {
>  	u32 feature;
> @@ -151,9 +161,10 @@ cpuid_dependent_features[] = {
>  	{ 0, 0 }
>  };
>  
> -void filter_cpuid_features(struct cpuinfo_x86 *c, bool warn)
> +void filter_cpuid_features(struct cpuinfo_x86 *c, bool early)
>  {

I have at least an inkling what 'warn' could mean.  But, 'early'?  One
man's 'early' is another one's 'late'.

>  	const struct cpuid_dependent_feature *df;
> +	const struct cpuid_dep *d;
>  
>  	for (df = cpuid_dependent_features; df->feature; df++) {
>  
> @@ -172,10 +183,22 @@ void filter_cpuid_features(struct cpuinfo_x86 *c, bool warn)
>  			continue;
>  
>  		clear_cpu_cap(c, df->feature);
> -		if (!warn)
> +		if (early)
>  			continue;

Why is it that 'early' calls don't want warnings?

>  		pr_warn("CPU: CPU feature " X86_CAP_FMT " disabled, no CPUID level 0x%x\n",
>  			x86_cap_flag(df->feature), df->level);
>  	}
> +
> +	for (d = cpuid_deps; d->feature; d++) {
> +
> +		if (!test_feature(c, d->feature) || test_feature(c, d->depends))
> +			continue;
> +
> +		clear_feature(c, d->feature);
> +
> +		pr_warn("CPU: CPU feature " X86_CAP_FMT " disabled, because it depends on "
> +			X86_CAP_FMT " which is not supported in CPUID\n",
> +			x86_cap_flag(d->feature), x86_cap_flag(d->depends));
> +	}
>  }

The do_clear_cpu_cap() does this with a loop, presumably because a later
(higher index in the array) feature in cpuid_deps[] could theoretically
clear an earlier (lower index) feature.

Also, is that message strictly correct?  There might have been a
clearcpuid= argument or even another dependency that ended up clearing a
bit.  It might have nothing to do with CPUID itself.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ