lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240726181424.000039a4@Huawei.com>
Date: Fri, 26 Jul 2024 18:14:24 +0100
From: Jonathan Cameron <Jonathan.Cameron@...wei.com>
To: Thomas Gleixner <tglx@...utronix.de>
CC: Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>, <linuxarm@...wei.com>,
	<rafael.j.wysocki@...el.com>, <guohanjun@...wei.com>, <gshan@...hat.com>,
	<miguel.luis@...cle.com>, <catalin.marinas@....com>, Linux List Kernel
 Mailing <linux-kernel@...r.kernel.org>, Linux regressions mailing list
	<regressions@...ts.linux.dev>, Ingo Molnar <mingo@...hat.com>, "Borislav
 Petkov" <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>,
	<x86@...nel.org>, "H. Peter Anvin" <hpa@...or.com>, "Bowman, Terry"
	<Terry.bowman@....com>, Shameerali Kolothum Thodi
	<shameerali.kolothum.thodi@...wei.com>
Subject: Re: 6.11/regression/bisected - The commit c1385c1f0ba3 caused a new
 possible recursive locking detected warning at computer boot.

On Fri, 26 Jul 2024 18:26:01 +0200
Thomas Gleixner <tglx@...utronix.de> wrote:

> On Thu, Jul 25 2024 at 18:13, Jonathan Cameron wrote:
> > On Tue, 23 Jul 2024 18:20:06 +0100
> > Jonathan Cameron <Jonathan.Cameron@...wei.com> wrote:
> >  
> >> > This is an interesting corner and perhaps reflects a flawed
> >> > assumption we were making that for this path anything that can happen for an
> >> > initially present CPU can also happen for a hotplugged one. On the hotplugged
> >> > path the lock was always held and hence the static_key_enable() would
> >> > have failed.  
> 
> No. The original code invoked this without cpus read locked via:
> 
> acpi_processor_driver.probe()
>    __acpi_processor_start()
>        ....
> 
> and the cpu hotplug callback finds it already set up, so it won't reach
> the static_key_enable() anymore.
> 
> > One bit I need to check out tomorrow is to make sure this doesn't race with the
> > workfn that is used to tear down the same static key on error.  
> 
> There is a simpler solution for that. See the uncompiled below.

Thanks.  FWIW I got pretty much the same suggestion from Shameer this
morning when he saw the workfn solution on list. Classic case of me
missing the simple solution because I was down in the weeds.

I'm absolutely fine with this fix.

Mikhail, please could you test Thomas' proposal so we are absolutely sure
nothing else is hiding.

Tglx's solution is much less likely to cause problems than what I proposed because
it avoids changing the ordering.

Jonathan




> 
> Thanks,
> 
>         tglx
> ---
> diff --git a/arch/x86/kernel/cpu/aperfmperf.c b/arch/x86/kernel/cpu/aperfmperf.c
> index b3fa61d45352..0b69bfbf345d 100644
> --- a/arch/x86/kernel/cpu/aperfmperf.c
> +++ b/arch/x86/kernel/cpu/aperfmperf.c
> @@ -306,7 +306,7 @@ static void freq_invariance_enable(void)
>  		WARN_ON_ONCE(1);
>  		return;
>  	}
> -	static_branch_enable(&arch_scale_freq_key);
> +	static_branch_enable_cpuslocked(&arch_scale_freq_key);
>  	register_freq_invariance_syscore_ops();
>  	pr_info("Estimated ratio of average max frequency by base frequency (times 1024): %llu\n", arch_max_freq_ratio);
>  }
> @@ -323,8 +323,10 @@ static void __init bp_init_freq_invariance(void)
>  	if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
>  		return;
>  
> -	if (intel_set_max_freq_ratio())
> +	if (intel_set_max_freq_ratio()) {
> +		guard(cpus_read_lock)();
>  		freq_invariance_enable();
> +	}
>  }
>  
>  static void disable_freq_invariance_workfn(struct work_struct *work)
> 
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ