Date:   Tue, 5 Feb 2019 11:10:45 +0100
From:   Corentin Labbe <clabbe.montjoie@...il.com>
To:     Robin Murphy <robin.murphy@....com>
Cc:     will.deacon@....com, mark.rutland@....com, suzuki.poulose@....com,
        peterz@...radead.org, bigeasy@...utronix.de,
        linux-kernel@...r.kernel.org, tglx@...utronix.de,
        "Li, Meng" <Meng.Li@...driver.com>,
        linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH 1/5] perf/arm-cci: Fix CPU hotplug race avoidance

On Mon, Feb 04, 2019 at 05:09:04PM +0000, Robin Murphy wrote:
> The arm-cci probe logic faces a cyclic dependency wherein it has to pick
> a valid CPU to associate with before registering the PMU device, has to
> have the PMU state initialised before handling hotplug events in case it
> must be migrated, but has to have the hotplug notifier registered before
> the chosen CPU may go offline lest things get out of sync. The present
> code has tried to solve the races by using get_cpu() to pick the current
> CPU and prevent it from disappearing while the other two registrations
> are performed, but that results in taking mutexes with preemption
> disabled, which makes certain configurations very unhappy:
> 
> [ 1.983337] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:2004
> [ 1.983340] in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: swapper/0
> [ 1.983342] Preemption disabled at:
> [ 1.983353] [<ffffff80089801f4>] cci_pmu_probe+0x1dc/0x488
> [ 1.983360] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.20-rt8-yocto-preempt-rt #1
> [ 1.983362] Hardware name: ZynqMP ZCU102 Rev1.0 (DT)
> [ 1.983364] Call trace:
> [ 1.983369] dump_backtrace+0x0/0x158
> [ 1.983372] show_stack+0x24/0x30
> [ 1.983378] dump_stack+0x80/0xa4
> [ 1.983383] ___might_sleep+0x138/0x160
> [ 1.983386] __might_sleep+0x58/0x90
> [ 1.983391] __rt_mutex_lock_state+0x30/0xc0
> [ 1.983395] _mutex_lock+0x24/0x30
> [ 1.983400] perf_pmu_register+0x2c/0x388
> [ 1.983404] cci_pmu_probe+0x2bc/0x488
> [ 1.983409] platform_drv_probe+0x58/0xa8
> 
> However, we don't actually mind being preempted or migrated at this
> point; all that really matters is that whichever CPU we pick does not
> get offlined before we're done. Thus, do the robust thing and instead
> take the lock to inhibit CPU hotplug for the duration. This also
> revealed an additional race in assigning the global pointer too late
> relative to the hotplug notifier, so that gets fixed in the process.
> 
> Reported-by: "Li, Meng" <Meng.Li@...driver.com>
> Signed-off-by: Robin Murphy <robin.murphy@....com>
> ---
>  drivers/perf/arm-cci.c | 22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/perf/arm-cci.c b/drivers/perf/arm-cci.c
> index 1bfeb160c5b1..f6d9df07ec9b 100644
> --- a/drivers/perf/arm-cci.c
> +++ b/drivers/perf/arm-cci.c
> @@ -1692,21 +1692,23 @@ static int cci_pmu_probe(struct platform_device *pdev)
>  	raw_spin_lock_init(&cci_pmu->hw_events.pmu_lock);
>  	mutex_init(&cci_pmu->reserve_mutex);
>  	atomic_set(&cci_pmu->active_events, 0);
> -	cci_pmu->cpu = get_cpu();
> +
> +	cpus_read_lock();
> +	cci_pmu->cpu = smp_processor_id();
>  
>  	ret = cci_pmu_init(cci_pmu, pdev);
> -	if (ret) {
> -		put_cpu();
> -		return ret;
> -	}
> +	if (ret)
> +		goto out;
>  
> -	cpuhp_setup_state_nocalls(CPUHP_AP_PERF_ARM_CCI_ONLINE,
> -				  "perf/arm/cci:online", NULL,
> -				  cci_pmu_offline_cpu);
> -	put_cpu();
>  	g_cci_pmu = cci_pmu;
> +	cpuhp_setup_state_nocalls_cpuslocked(CPUHP_AP_PERF_ARM_CCI_ONLINE,
> +					     "perf/arm/cci:online", NULL,
> +					     cci_pmu_offline_cpu);
> +
>  	pr_info("ARM %s PMU driver probed", cci_pmu->model->name);
> -	return 0;
> +out:
> +	cpus_read_unlock();
> +	return ret;
>  }
>  
>  static int cci_pmu_remove(struct platform_device *pdev)
> -- 
> 2.20.1.dirty

Hello

Thanks, this patch fixes the issue that I reported here:
https://lkml.org/lkml/2017/12/29/139
https://lkml.org/lkml/2018/11/12/1901

Tested-by: Corentin Labbe <clabbe.montjoie@...il.com>
Tested-on: sun8i-a83t-bananapi-m3

Regards
