lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <qvgmqkytn44ysjbox6b2db75ri6yc3lmncpb7etiajnueeuab2@6mrdmoqqnebn>
Date: Thu, 27 Feb 2025 10:59:12 -0600
From: Lucas De Marchi <lucas.demarchi@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
CC: <mingo@...nel.org>, <ravi.bangoria@....com>,
	<linux-kernel@...r.kernel.org>, <willy@...radead.org>, <acme@...nel.org>,
	<namhyung@...nel.org>, <mark.rutland@....com>,
	<alexander.shishkin@...ux.intel.com>, <jolsa@...nel.org>,
	<irogers@...gle.com>, <adrian.hunter@...el.com>, <kan.liang@...ux.intel.com>,
	<intel-xe@...ts.freedesktop.org>
Subject: Re: [PATCH v2 06/24] perf: Fix pmus_lock vs pmus_srcu ordering

On Wed, Feb 05, 2025 at 11:21:26AM +0100, Peter Zijlstra wrote:
>Commit a63fbed776c7 ("perf/tracing/cpuhotplug: Fix locking order")
>placed pmus_lock inside pmus_srcu, this makes perf_pmu_unregister()
>trip lockdep.
>
>Move the locking about such that only pmu_idr and pmus (list) are

s/about/around/ ?

>modified while holding pmus_lock. This avoids doing synchronize_srcu()
>while holding pmus_lock and all is well again.
>
>Fixes: a63fbed776c7 ("perf/tracing/cpuhotplug: Fix locking order")
>Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>

Without this patch, I occasionally see the locking issue below, but it
doesn't always happen. I'm running with this patch and haven't seen the
issue since.

[  268.195731] ======================================================
[  268.201945] WARNING: possible circular locking dependency detected
[  268.208165] 6.14.0-rc4-xe+ #59 Tainted: G     U
[  268.213777] ------------------------------------------------------
[  268.219988] kmod-unbind/2509 is trying to acquire lock:
[  268.225238] ffffffff8b40b310 (&pmus_srcu){.+.+}-{0:0}, at: __synchronize_srcu+0x21/0x270
[  268.233374]
                but task is already holding lock:
[  268.239241] ffffffff86381de8 (pmus_lock){+.+.}-{3:3}, at: perf_pmu_unregister+0x25/0x2f0
[  268.247372]
                which lock already depends on the new lock.

[  268.255577]
                the existing dependency chain (in reverse order) is:
[  268.263101]
                -> #1 (pmus_lock){+.+.}-{3:3}:
[  268.268717]        __mutex_lock+0x1a4/0x1300
[  268.273019]        mutex_lock_nested+0x1b/0x30
[  268.277489]        perf_swevent_init+0x15d/0x610
[  268.282142]        perf_try_init_event+0x111/0xba0
[  268.286971]        perf_event_alloc+0x1024/0x3eb0
[  268.291703]        __do_sys_perf_event_open+0x3b0/0x2460
[  268.297046]        __x64_sys_perf_event_open+0xbe/0x160
[  268.302305]        x64_sys_call+0x1eb0/0x2650
[  268.306689]        do_syscall_64+0x91/0x180
[  268.310918]        entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  268.316524]
                -> #0 (&pmus_srcu){.+.+}-{0:0}:
[  268.322215]        __lock_acquire+0x3339/0x6980
[  268.326777]        lock_sync+0xec/0x180
[  268.330645]        __synchronize_srcu+0xaf/0x270
[  268.335291]        synchronize_srcu+0x64/0x2b0
[  268.339776]        perf_pmu_unregister+0xed/0x2f0
[  268.344523]        xe_pmu_unregister+0x94/0xf0 [xe]
[  268.349604]        devm_action_release+0x49/0x80
[  268.354257]        devres_release_all+0x180/0x240
[  268.358996]        device_unbind_cleanup+0x1b/0x1b0
[  268.363910]        device_release_driver_internal+0x44a/0x5b0
[  268.369693]        device_driver_detach+0x36/0x50
[  268.374435]        unbind_store+0xe0/0x100
[  268.378571]        drv_attr_store+0x6a/0xc0
[  268.382782]        sysfs_kf_write+0x106/0x160
[  268.387176]        kernfs_fop_write_iter+0x359/0x550
[  268.392180]        vfs_write+0x64c/0x1030
[  268.396234]        ksys_write+0x11a/0x240
[  268.400284]        __x64_sys_write+0x72/0xc0
[  268.404587]        x64_sys_call+0x2c4/0x2650
[  268.408885]        do_syscall_64+0x91/0x180
[  268.413102]        entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  268.418708]
                other info that might help us debug this:

[  268.426746]  Possible unsafe locking scenario:

[  268.432715]        CPU0                    CPU1
[  268.437277]        ----                    ----
[  268.441842]   lock(pmus_lock);
[  268.444932]                                lock(&pmus_srcu);
[  268.450633]                                lock(pmus_lock);
[  268.456243]   sync(&pmus_srcu);
[  268.459419]
                 *** DEADLOCK ***

[  268.465375] 4 locks held by kmod-unbind/2509:
[  268.469779]  #0: ffff888135689420 (sb_writers#6){.+.+}-{0:0}, at: ksys_write+0x11a/0x240
[  268.477911]  #1: ffff8881107aee88 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x221/0x550
[  268.486829]  #2: ffff888111cc41b0 (&dev->mutex){....}-{3:3}, at: device_release_driver_internal+0xa2/0x5b0
[  268.496547]  #3: ffffffff86381de8 (pmus_lock){+.+.}-{3:3}, at: perf_pmu_unregister+0x25/0x2f0


Tested-by: Lucas De Marchi <lucas.demarchi@...el.com>

Lucas De Marchi

>---
> kernel/events/core.c |    4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
>--- a/kernel/events/core.c
>+++ b/kernel/events/core.c
>@@ -11836,6 +11836,8 @@ void perf_pmu_unregister(struct pmu *pmu
> {
> 	mutex_lock(&pmus_lock);
> 	list_del_rcu(&pmu->entry);
>+	idr_remove(&pmu_idr, pmu->type);
>+	mutex_unlock(&pmus_lock);
>
> 	/*
> 	 * We dereference the pmu list under both SRCU and regular RCU, so
>@@ -11845,7 +11847,6 @@ void perf_pmu_unregister(struct pmu *pmu
> 	synchronize_rcu();
>
> 	free_percpu(pmu->pmu_disable_count);
>-	idr_remove(&pmu_idr, pmu->type);
> 	if (pmu_bus_running && pmu->dev && pmu->dev != PMU_NULL_DEV) {
> 		if (pmu->nr_addr_filters)
> 			device_remove_file(pmu->dev, &dev_attr_nr_addr_filters);
>@@ -11853,7 +11854,6 @@ void perf_pmu_unregister(struct pmu *pmu
> 		put_device(pmu->dev);
> 	}
> 	free_pmu_context(pmu);
>-	mutex_unlock(&pmus_lock);
> }
> EXPORT_SYMBOL_GPL(perf_pmu_unregister);
>
>
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ