[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAFRLqsVNSj=MCNWVsoXA335LAhvz=oELZi8kg+yFnwWW9A3jrg@mail.gmail.com>
Date: Wed, 24 Sep 2025 17:25:11 +0800
From: Cen Zhang <zzzccc427@...il.com>
To: Luiz Augusto von Dentz <luiz.dentz@...il.com>, johan.hedberg@...il.com, marcel@...tmann.org
Cc: linux-kernel@...r.kernel.org, baijiaju1990@...il.com,
zhenghaoran154@...il.com, r33s3n6@...il.com, linux-bluetooth@...r.kernel.org,
"gality369@...il.com" <gality369@...il.com>
Subject: Re: [BUG] Bluetooth: hci_sync: Fix race in hci_cmd_sync_dequeue_once
causing list corruption GPF
Hi all,
Apologies for the previous mail — the bug report formatting was not correct.
Sorry for the noise and thank you for your understanding.
Best regards,
Cen Zhang
Cen Zhang <zzzccc427@...il.com> 于2025年9月24日周三 17:22写道:
>
> hci_cmd_sync_dequeue_once() used to: (1) call
> hci_cmd_sync_lookup_entry() which takes and releases
> cmd_sync_work_lock, returning a raw pointer to the entry (2) later
> call hci_cmd_sync_cancel_entry() (re‑taking the lock) and
> list_del()/kfree() the same entry
>
> Between (1) and (2) the cmd_sync worker thread (hci_cmd_sync_work())
> can concurrently:
>
> lock cmd_sync_work_lock
> list_del() and remove the same entry
> unlocked
>
> So the list entry was accessed after it had already been deleted,
> leading to a wild memory access.
> The detailed stack trace is shown below.
>
> Oops: general protection fault, probably for non-canonical address
> 0xff7aaf8000000004: 0000 [#1] SMP KASAN PTI
> KASAN: maybe wild-memory-access in range [0xfbd59c0000000020-0xfbd59c0000000027]
> CPU: 0 UID: 0 PID: 323 Comm: kworker/u17:6 Not tainted
> 6.17.0-rc5-ge5bbb70171d1-dirty #21 PREEMPT(voluntary)
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> Workqueue: hci11 hci_conn_timeout
> RIP: 0010:__list_del include/linux/list.h:195 [inline]
> RIP: 0010:__list_del_entry include/linux/list.h:218 [inline]
> RIP: 0010:list_del include/linux/list.h:229 [inline]
> RIP: 0010:_hci_cmd_sync_cancel_entry net/bluetooth/hci_sync.c:647 [inline]
> RIP: 0010:hci_cmd_sync_cancel_entry net/bluetooth/hci_sync.c:851 [inline]
> RIP: 0010:hci_cmd_sync_dequeue_once+0x660/0x950 net/bluetooth/hci_sync.c:870
> Code: c6 fc 48 8b 03 48 89 44 24 28 4c 8d 78 08 4d 89 fe 49 c1 ee 03
> 48 b9 00 00 00 00 00 fc ff df 4d 8d 24 0e 4c 89 e0 48 c1 e8 03 <0f> b6
> 04 08 84 c0 0f 85 4d 02 00 00 45 0f b6 24 24 31 ff 44 89 e6
> RSP: 0018:ffff88811a597b10 EFLAGS: 00010a02
> RAX: 1f7ab38000000004 RBX: ffff88810d76df80 RCX: dffffc0000000000
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: dead000000000122 R08: ffff88811a597aef R09: 1ffff110234b2f5d
> R10: dffffc0000000000 R11: ffffed10234b2f5e R12: fbd59c0000000021
> R13: 1ffffda20435db7e R14: 1bd5a00000000021 R15: dead000000000108
> FS: 0000000000000000(0000) GS:ffff88826d216000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fb9aea860b8 CR3: 0000000102d50000 CR4: 00000000000006f0
> Call Trace:
> <TASK>
> hci_cancel_connect_sync+0x1bb/0x2e0
> hci_abort_conn+0x4b5/0x9a0 net/bluetooth/hci_conn.c:2958
> hci_conn_timeout+0x3a8/0x540 net/bluetooth/hci_conn.c:579
> process_one_work kernel/workqueue.c:3236 [inline]
> process_scheduled_works+0x7a8/0x1030 kernel/workqueue.c:3319
> worker_thread+0xb97/0x11d0 kernel/workqueue.c:3400
> kthread+0x3d4/0x800 kernel/kthread.c:463
> ret_from_fork+0x13b/0x1e0 arch/x86/kernel/process.c:148
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> </TASK>
>
> After our analysis,may be we can fix this by making these two
> operations in hci_cmd_sync_dequeue_once() atomic — i.e. protecting
> both under the same lock as below:
>
> bool hci_cmd_sync_dequeue_once(struct hci_dev *hdev,
> hci_cmd_sync_work_func_t func,
> void *data, hci_cmd_sync_work_destroy_t destroy)
> {
> - struct hci_cmd_sync_work_entry *entry;
> -
> - entry = hci_cmd_sync_lookup_entry(hdev, func, data, destroy);
> - if (!entry)
> - return false;
> -
> - hci_cmd_sync_cancel_entry(hdev, entry, -ECANCELED);
> - return true;
> + struct hci_cmd_sync_work_entry *entry;
> +
> + mutex_lock(&hdev->cmd_sync_work_lock);
> + entry = _hci_cmd_sync_lookup_entry(hdev, func, data, destroy);
> + if (!entry) {
> + mutex_unlock(&hdev->cmd_sync_work_lock);
> + return false;
> + }
> + _hci_cmd_sync_cancel_entry(hdev, entry, -ECANCELED);
> + mutex_unlock(&hdev->cmd_sync_work_lock);
> + return true;
> }
Powered by blists - more mailing lists