[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230621154337.1668594-1-edumazet@google.com>
Date: Wed, 21 Jun 2023 15:43:37 +0000
From: Eric Dumazet <edumazet@...gle.com>
To: "David S . Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>
Cc: netdev@...r.kernel.org, eric.dumazet@...il.com,
Eric Dumazet <edumazet@...gle.com>, syzbot+a7d200a347f912723e5c@...kaller.appspotmail.com,
Johannes Berg <johannes.berg@...el.com>
Subject: [PATCH net] netlink: fix potential deadlock in netlink_set_err()
syzbot reported a possible deadlock in netlink_set_err() [1]
A similar issue was fixed in commit 1d482e666b8e ("netlink: disable IRQs
for netlink_lock_table()") in netlink_lock_table()
This patch adds IRQ safety to netlink_set_err() and __netlink_diag_dump()
which were not covered by cited commit.
[1]
WARNING: possible irq lock inversion dependency detected
6.4.0-rc6-syzkaller-00240-g4e9f0ec38852 #0 Not tainted
syz-executor.2/23011 just changed the state of lock:
ffffffff8e1a7a58 (nl_table_lock){.+.?}-{2:2}, at: netlink_set_err+0x2e/0x3a0 net/netlink/af_netlink.c:1612
but this lock was taken by another, SOFTIRQ-safe lock in the past:
(&local->queue_stop_reason_lock){..-.}-{2:2}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(nl_table_lock);
local_irq_disable();
lock(&local->queue_stop_reason_lock);
lock(nl_table_lock);
<Interrupt>
lock(&local->queue_stop_reason_lock);
*** DEADLOCK ***
Fixes: 1d482e666b8e ("netlink: disable IRQs for netlink_lock_table()")
Reported-by: syzbot+a7d200a347f912723e5c@...kaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=a7d200a347f912723e5c
Link: https://lore.kernel.org/netdev/000000000000e38d1605fea5747e@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@...gle.com>
Cc: Johannes Berg <johannes.berg@...el.com>
---
net/netlink/af_netlink.c | 5 +++--
net/netlink/diag.c | 5 +++--
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 3a1e0fd5bf149c3ece9e0f004107efe149e3f2c3..5968b6450d828a52a3990d18ea597687ab03170e 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1600,6 +1600,7 @@ static int do_one_set_err(struct sock *sk, struct netlink_set_err_data *p)
int netlink_set_err(struct sock *ssk, u32 portid, u32 group, int code)
{
struct netlink_set_err_data info;
+ unsigned long flags;
struct sock *sk;
int ret = 0;
@@ -1609,12 +1610,12 @@ int netlink_set_err(struct sock *ssk, u32 portid, u32 group, int code)
/* sk->sk_err wants a positive error value */
info.code = -code;
- read_lock(&nl_table_lock);
+ read_lock_irqsave(&nl_table_lock, flags);
sk_for_each_bound(sk, &nl_table[ssk->sk_protocol].mc_list)
ret += do_one_set_err(sk, &info);
- read_unlock(&nl_table_lock);
+ read_unlock_irqrestore(&nl_table_lock, flags);
return ret;
}
EXPORT_SYMBOL(netlink_set_err);
diff --git a/net/netlink/diag.c b/net/netlink/diag.c
index c6255eac305c7b64f6e00f10d3bf8cf42c680c15..4143b2ea4195aeacbd4eb828430c836cfb79d859 100644
--- a/net/netlink/diag.c
+++ b/net/netlink/diag.c
@@ -94,6 +94,7 @@ static int __netlink_diag_dump(struct sk_buff *skb, struct netlink_callback *cb,
struct net *net = sock_net(skb->sk);
struct netlink_diag_req *req;
struct netlink_sock *nlsk;
+ unsigned long flags;
struct sock *sk;
int num = 2;
int ret = 0;
@@ -152,7 +153,7 @@ static int __netlink_diag_dump(struct sk_buff *skb, struct netlink_callback *cb,
num++;
mc_list:
- read_lock(&nl_table_lock);
+ read_lock_irqsave(&nl_table_lock, flags);
sk_for_each_bound(sk, &tbl->mc_list) {
if (sk_hashed(sk))
continue;
@@ -173,7 +174,7 @@ static int __netlink_diag_dump(struct sk_buff *skb, struct netlink_callback *cb,
}
num++;
}
- read_unlock(&nl_table_lock);
+ read_unlock_irqrestore(&nl_table_lock, flags);
done:
cb->args[0] = num;
--
2.41.0.178.g377b9f9a00-goog
Powered by blists - more mailing lists