Message-ID: <20250814173142.632749-2-ysk@kzalloc.com>
Date: Thu, 14 Aug 2025 17:31:43 +0000
From: Yunseong Kim <ysk@...lloc.com>
To: Krzysztof Kozlowski <krzk@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>
Cc: Simon Horman <horms@...nel.org>,
Taehee Yoo <ap420073@...il.com>,
Byungchul Park <byungchul@...com>,
max.byungchul.park@...il.com,
yeoreum.yun@....com,
ppbuk5246@...il.com,
netdev@...r.kernel.org,
linux-kernel@...r.kernel.org,
Yunseong Kim <ysk@...lloc.com>
Subject: [PATCH] net/nfc: Fix A-B/B-A deadlock between nfc_unregister_device and rfkill_fop_write
A potential A-B/B-A deadlock exists between the NFC core and the RFKill
subsystem, involving the NFC device lock and the rfkill_global_mutex.
This issue is particularly visible on PREEMPT_RT kernels, which can
report the following warning:
| rtmutex deadlock detected
| WARNING: CPU: 0 PID: 22729 at kernel/locking/rtmutex.c:1674 rt_mutex_handle_deadlock+0x68/0xec kernel/locking/rtmutex.c:-1
| Modules linked in:
| CPU: 0 UID: 0 PID: 22729 Comm: syz.7.2187 Kdump: loaded Not tainted 6.17.0-rc1-00001-g1149a5db27c8-dirty #55 PREEMPT_RT
| Hardware name: QEMU KVM Virtual Machine, BIOS 2025.02-8ubuntu1 06/11/2025
| pstate: 63400005 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
| pc : rt_mutex_handle_deadlock+0x68/0xec kernel/locking/rtmutex.c:-1
| lr : rt_mutex_handle_deadlock+0x40/0xec kernel/locking/rtmutex.c:1674
| sp : ffff8000967c7720
| x29: ffff8000967c7720 x28: 1fffe0001946d182 x27: dfff800000000000
| x26: 0000000000000001 x25: 0000000000000003 x24: 1fffe0001946d00b
| x23: 1fffe0001946d182 x22: ffff80008aec8940 x21: dfff800000000000
| x20: ffff0000ca368058 x19: ffff0000ca368c10 x18: ffff80008af6b6e0
| x17: 1fffe000590b8088 x16: ffff80008046cc08 x15: 0000000000000001
| x14: 1fffe000590ba990 x13: 0000000000000000 x12: 0000000000000000
| x11: ffff6000590ba991 x10: 0000000000000002 x9 : 0fe446e029bcfe00
| x8 : 0000000000000000 x7 : 0000000000000000 x6 : 000000000000003f
| x5 : 0000000000000001 x4 : 0000000000001000 x3 : ffff800080503efc
| x2 : 0000000000000001 x1 : 0000000000000001 x0 : 0000000000000001
| Call trace:
| rt_mutex_handle_deadlock+0x68/0xec kernel/locking/rtmutex.c:-1 (P)
| __rt_mutex_slowlock+0x1cc/0x480 kernel/locking/rtmutex.c:1734
| __rt_mutex_slowlock_locked kernel/locking/rtmutex.c:1760 [inline]
| rt_mutex_slowlock+0x140/0x21c kernel/locking/rtmutex.c:1800
| __rt_mutex_lock kernel/locking/rtmutex.c:1815 [inline]
| __mutex_lock_common kernel/locking/rtmutex_api.c:536 [inline]
| mutex_lock+0xf0/0x10c kernel/locking/rtmutex_api.c:603
| device_lock include/linux/device.h:911 [inline]
| nfc_dev_down net/nfc/core.c:143 [inline]
| nfc_rfkill_set_block+0x48/0x2a4 net/nfc/core.c:179
| rfkill_set_block+0x184/0x364 net/rfkill/core.c:346
| rfkill_fop_write+0x4dc/0x624 net/rfkill/core.c:1301
| vfs_write+0x2b8/0xa30 fs/read_write.c:684
| ksys_write+0x120/0x210 fs/read_write.c:738
| __do_sys_write fs/read_write.c:749 [inline]
| __se_sys_write fs/read_write.c:746 [inline]
| __arm64_sys_write+0x7c/0x90 fs/read_write.c:746
| __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
| invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
| el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
| do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
| el0_svc+0x40/0x140 arch/arm64/kernel/entry-common.c:879
| el0t_64_sync_handler+0x84/0x12c arch/arm64/kernel/entry-common.c:898
| el0t_64_sync+0x1ac/0x1b0 arch/arm64/kernel/entry.S:596
The scenario is as follows:
Task A (rfkill_fop_write):
1. Acquires rfkill_global_mutex.
2. Iterates devices and calls rfkill_set_block()
-> nfc_rfkill_set_block()
-> nfc_dev_down().
3. Tries to acquire NFC device_lock.
Task B (nfc_unregister_device):
1. Acquires NFC device_lock.
2. Calls rfkill_unregister().
3. Tries to acquire rfkill_global_mutex.
Task A waits for the device_lock held by Task B, while Task B waits for
the rfkill_global_mutex held by Task A.
To fix this, move the calls to rfkill_unregister() and rfkill_destroy()
outside the device_lock critical section in nfc_unregister_device().
We ensure this is safe by first acquiring the device_lock, setting the
shutting_down flag (which prevents races with nfc_dev_down()),
stashing the rfkill pointer in a local variable, nullifying the pointer
in the nfc_dev structure, and then releasing the device_lock before
calling the rfkill unregister functions. This breaks the lock inversion.
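For illustration only, the lock ordering after this change can be modeled
in userspace with pthread mutexes. This is a minimal sketch, not kernel
code: every toy_* name below is a hypothetical stand-in (toy_global_mutex
for rfkill_global_mutex, the device_lock member for device_lock(), and
toy_rfkill_unregister() for rfkill_unregister()), and the -1 return value
is an arbitrary choice for the sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel's types. */
struct toy_rfkill {
	bool registered;
};

struct toy_nfc_dev {
	pthread_mutex_t device_lock;	/* plays the role of device_lock() */
	bool shutting_down;
	struct toy_rfkill *rfkill;
};

/* Plays the role of rfkill_global_mutex. */
static pthread_mutex_t toy_global_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Like rfkill_unregister() in this model: it needs the global mutex. */
static void toy_rfkill_unregister(struct toy_rfkill *rfk)
{
	assert(rfk != NULL);
	pthread_mutex_lock(&toy_global_mutex);
	rfk->registered = false;
	pthread_mutex_unlock(&toy_global_mutex);
}

/* Task A path: global mutex first, then the device lock. */
static int toy_set_block(struct toy_nfc_dev *dev)
{
	int rc = 0;

	pthread_mutex_lock(&toy_global_mutex);
	pthread_mutex_lock(&dev->device_lock);
	if (dev->shutting_down)
		rc = -1;	/* device is going away; do nothing */
	pthread_mutex_unlock(&dev->device_lock);
	pthread_mutex_unlock(&toy_global_mutex);
	return rc;
}

/*
 * Task B path with the reordering described above: the device lock is
 * released before the global mutex is taken, so this path never holds
 * both locks at once.
 */
static void toy_unregister_device(struct toy_nfc_dev *dev)
{
	struct toy_rfkill *rfk;

	pthread_mutex_lock(&dev->device_lock);
	dev->shutting_down = true;
	rfk = dev->rfkill;	/* stash the pointer in a local */
	dev->rfkill = NULL;	/* and clear it in the device */
	pthread_mutex_unlock(&dev->device_lock);

	if (rfk)
		toy_rfkill_unregister(rfk);
}
```

In this model, Task A still nests the device lock inside the global mutex,
but Task B no longer takes the global mutex while holding the device lock,
so the two paths cannot wait on each other in a cycle.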
Signed-off-by: Yunseong Kim <ysk@...lloc.com>
---
net/nfc/core.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/net/nfc/core.c b/net/nfc/core.c
index ae1c842f9c64..c8dc6414514b 100644
--- a/net/nfc/core.c
+++ b/net/nfc/core.c
@@ -1154,6 +1154,7 @@ EXPORT_SYMBOL(nfc_register_device);
void nfc_unregister_device(struct nfc_dev *dev)
{
int rc;
+ struct rfkill *rfk = NULL;
pr_debug("dev_name=%s\n", dev_name(&dev->dev));
@@ -1163,14 +1164,18 @@ void nfc_unregister_device(struct nfc_dev *dev)
"was removed\n", dev_name(&dev->dev));
device_lock(&dev->dev);
+ dev->shutting_down = true;
if (dev->rfkill) {
- rfkill_unregister(dev->rfkill);
- rfkill_destroy(dev->rfkill);
+ rfk = dev->rfkill;
dev->rfkill = NULL;
}
- dev->shutting_down = true;
device_unlock(&dev->dev);
+ if (rfk) {
+ rfkill_unregister(rfk);
+ rfkill_destroy(rfk);
+ }
+
if (dev->ops->check_presence) {
timer_delete_sync(&dev->check_pres_timer);
cancel_work_sync(&dev->check_pres_work);
--
2.50.0