Message-ID: <94942257-927c-efbc-b3fd-44cc097ad71f@gmail.com>
Date: Thu, 2 Sep 2021 14:41:21 -0700
From: Eric Dumazet <eric.dumazet@...il.com>
To: Desmond Cheong Zhi Xi <desmondcheongzx@...il.com>,
Eric Dumazet <eric.dumazet@...il.com>, marcel@...tmann.org,
johan.hedberg@...il.com, luiz.dentz@...il.com, davem@...emloft.net,
kuba@...nel.org, sudipm.mukherjee@...il.com
Cc: linux-bluetooth@...r.kernel.org, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, skhan@...uxfoundation.org,
gregkh@...uxfoundation.org,
linux-kernel-mentees@...ts.linuxfoundation.org,
syzbot+2f6d7c28bb4bf7e82060@...kaller.appspotmail.com
Subject: Re: [PATCH v6 1/6] Bluetooth: schedule SCO timeouts with delayed_work
On 9/2/21 12:32 PM, Desmond Cheong Zhi Xi wrote:
>
> Hi Eric,
>
> This actually seems to be a pre-existing error in sco_sock_connect that we now hit in sco_sock_timeout.
>
> Any thoughts on the following patch to address the problem?
>
> Link: https://lore.kernel.org/lkml/20210831065601.101185-1-desmondcheongzx@gmail.com/
syzbot is still working on finding a repro. This is obviously not trivial,
because this is a race window.

I think this can happen even with a single SCO connection.

This might be triggered more easily by forcing a delay in sco_sock_timeout():

diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c
index 98a88158651281c9f75c4e0371044251e976e7ef..71ebe0243fab106c676c308724fe3a3f92a62cbd 100644
--- a/net/bluetooth/sco.c
+++ b/net/bluetooth/sco.c
@@ -84,8 +84,14 @@ static void sco_sock_timeout(struct work_struct *work)
 	sco_conn_lock(conn);
 	sk = conn->sk;
-	if (sk)
+	if (sk) {
+		// let's pretend the CPU has been busy (in interrupts) for 100ms
+		int i;
+		for (i = 0; i < 100000; i++)
+			udelay(1);
+
 		sock_hold(sk);
+	}
 	sco_conn_unlock(conn);
 	if (!sk)

The stack trace tells us that sco_sock_timeout() is running after the last
reference on the socket has been released:

__refcount_add include/linux/refcount.h:199 [inline]
__refcount_inc include/linux/refcount.h:250 [inline]
refcount_inc include/linux/refcount.h:267 [inline]
sock_hold include/net/sock.h:702 [inline]
sco_sock_timeout+0x216/0x290 net/bluetooth/sco.c:88
process_one_work+0x98d/0x1630 kernel/workqueue.c:2276
worker_thread+0x658/0x11f0 kernel/workqueue.c:2422
kthread+0x3e5/0x4d0 kernel/kthread.c:319
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295

This is why I suggested delaying sock_put() until after
cancel_delayed_work_sync(), to make sure this cannot happen:

diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c
index 98a88158651281c9f75c4e0371044251e976e7ef..bd0222e3f05a6bcb40cffe8405c9dfff98d7afde 100644
--- a/net/bluetooth/sco.c
+++ b/net/bluetooth/sco.c
@@ -195,10 +195,11 @@ static void sco_conn_del(struct hci_conn *hcon, int err)
 		sco_sock_clear_timer(sk);
 		sco_chan_del(sk, err);
 		release_sock(sk);
-		sock_put(sk);
 		/* Ensure no more work items will run before freeing conn. */
 		cancel_delayed_work_sync(&conn->timeout_work);
+
+		sock_put(sk);
 	}
 	hcon->sco_data = NULL;
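
To illustrate why the ordering matters, here is a rough user-space sketch. It is
a single-threaded toy model, not kernel code: every toy_* name is made up for
this example, the "in flight" flag stands in for a timeout work item that has
been queued but has not finished, and toy_cancel_work_sync() only imitates the
"wait for the work to complete" part of cancel_delayed_work_sync(). It assumes
the teardown path holds the last reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct sock / struct sco_conn. */
struct toy_sock { int refcnt; };
struct toy_conn { struct toy_sock *sk; bool work_in_flight; };

/* Like sock_hold(), but reports a hold on a zero refcount instead of warning. */
static bool toy_sock_hold(struct toy_sock *sk)
{
	if (sk->refcnt == 0)
		return false;	/* the "addition on 0" case from the trace */
	sk->refcnt++;
	return true;
}

static void toy_sock_put(struct toy_sock *sk)
{
	assert(sk->refcnt > 0);
	sk->refcnt--;
}

/* Toy timeout handler: take a reference, do the work, drop the reference. */
static bool toy_timeout(struct toy_conn *conn)
{
	struct toy_sock *sk = conn->sk;

	if (!sk)
		return true;
	if (!toy_sock_hold(sk))
		return false;
	toy_sock_put(sk);
	return true;
}

/* Assumption: "waiting" is modeled by running the pending handler right here. */
static bool toy_cancel_work_sync(struct toy_conn *conn)
{
	bool ok = true;

	if (conn->work_in_flight) {
		ok = toy_timeout(conn);
		conn->work_in_flight = false;
	}
	return ok;
}

/* Returns true if the handler never saw a socket with a zero refcount. */
bool toy_teardown_is_safe(bool put_before_cancel)
{
	struct toy_sock sk = { .refcnt = 1 };	/* teardown owns the last ref */
	struct toy_conn conn = { .sk = &sk, .work_in_flight = true };
	bool ok;

	if (put_before_cancel) {
		toy_sock_put(&sk);		/* last reference dropped first */
		ok = toy_cancel_work_sync(&conn);
	} else {
		ok = toy_cancel_work_sync(&conn);
		toy_sock_put(&sk);		/* dropped only after work is done */
	}
	conn.sk = NULL;
	return ok;
}
```

In this model, toy_teardown_is_safe(true) reports the bad hold and
toy_teardown_is_safe(false) does not. The real race depends on locking and
timing that this sketch does not capture.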