Message-ID: <20240513100246.85173-1-jianbol@nvidia.com>
Date: Mon, 13 May 2024 13:02:46 +0300
From: Jianbo Liu <jianbol@...dia.com>
To: <netdev@...r.kernel.org>, <edumazet@...gle.com>
CC: Jianbo Liu <jianbol@...dia.com>, Leon Romanovsky <leonro@...dia.com>
Subject: [PATCH net] net: drop secpath extension before skb deferral free

Since commit 68822bdf76f1 ("net: generalize skb freeing deferral to
per-cpu lists"), an skb can be queued on a remote CPU's list for
deferred freeing.

The remote CPU is kicked when the queue reaches half capacity. As
mentioned in that commit, it seems very unlikely that NET_RX_SOFTIRQ
is triggered on the remote CPU in this way. That does not match what
we observed, however: the skb is not freed immediately and can even be
kept for a long time. This becomes more likely as the number of CPU
cores increases.
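
The half-capacity kick described above can be sketched as a small
user-space model. This is only an illustration, not the kernel code:
the names model_defer_queue and model_defer_enqueue are hypothetical,
and the assumption that the counter is incremented after the kick
check is mine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU deferral state (not kernel code). */
struct model_defer_queue {
	size_t count;	/* buffers currently parked on the list */
	size_t max;	/* capacity, playing the role of defer_max */
};

enum model_defer_result {
	MODEL_NODEFER,		/* queue full: caller frees right away */
	MODEL_QUEUED,		/* parked, remote CPU not notified */
	MODEL_QUEUED_KICK,	/* parked, remote CPU is kicked */
};

static enum model_defer_result model_defer_enqueue(struct model_defer_queue *q)
{
	bool kick;

	assert(q != NULL);
	if (q->count >= q->max)
		return MODEL_NODEFER;

	/* Kick only when the queue sits at exactly half capacity. */
	kick = q->count == (q->max >> 1);
	q->count++;	/* assumption: incremented after the kick check */
	return kick ? MODEL_QUEUED_KICK : MODEL_QUEUED;
}
```

In this model, with a capacity of 8, the first four buffers are parked
without any kick, so a lightly used queue depends on the remote CPU
running its receive softirq for other reasons.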

As the skb is not freed, its extension is not freed either. An error
occurred while unloading the driver after running TCP traffic with
IPsec, where both crypto and packet were offloaded. In the crypto
offload case, however, the failure was rare and significantly harder
to reproduce.

 unregister_netdevice: waiting for eth2 to become free. Usage count = 2
 ref_tracker: eth%d@...000007421424b has 1/1 users at
      xfrm_dev_state_add+0xe5/0x4d0
      xfrm_add_sa+0xc5c/0x11e0
      xfrm_user_rcv_msg+0xfa/0x240
      netlink_rcv_skb+0x54/0x100
      xfrm_netlink_rcv+0x31/0x40
      netlink_unicast+0x1fc/0x2c0
      netlink_sendmsg+0x232/0x4a0
      __sock_sendmsg+0x38/0x60
      ____sys_sendmsg+0x1e3/0x200
      ___sys_sendmsg+0x80/0xc0
      __sys_sendmsg+0x51/0x90
      do_syscall_64+0x40/0xe0
      entry_SYSCALL_64_after_hwframe+0x46/0x4e

The ref_tracker shows that the netdev is held when the offloading xfrm
state is first added to hardware. When a packet is received, the
secpath extension, which saves the xfrm state, is added to the skb by
IPsec offload, so the xfrm state is held by the received skb. The
state can't be flushed until the skb is dequeued from the defer list
and the skb and its extension are actually freed. The netdev, in turn,
can't be unregistered because it is still referenced by the xfrm
state.

To fix this issue, drop this extension before the skb is queued to the
defer list, so that xfrm state destruction is not blocked.
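
The reference chain and the effect of dropping the extension early can
be sketched with a user-space refcount model. It is a simplification
under stated assumptions, not the kernel implementation: all model_*
names are hypothetical, refcounts are plain integers, and "deferring"
an skb just means leaving it allocated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for netdev, xfrm state and skb. */
struct model_netdev { int refcnt; };
struct model_xfrm_state { int refcnt; struct model_netdev *dev; };
struct model_skb { struct model_xfrm_state *secpath; };

/* Adding an offloaded state takes a reference on the device. */
static void model_state_add(struct model_xfrm_state *x, struct model_netdev *dev)
{
	x->refcnt = 1;
	x->dev = dev;
	dev->refcnt++;
}

/* Dropping the last state reference releases the device reference. */
static void model_state_put(struct model_xfrm_state *x)
{
	assert(x->refcnt > 0);
	if (--x->refcnt == 0)
		x->dev->refcnt--;
}

/* On receive, the secpath extension pins the state. */
static void model_skb_attach_secpath(struct model_skb *skb, struct model_xfrm_state *x)
{
	skb->secpath = x;
	x->refcnt++;
}

static void model_skb_drop_secpath(struct model_skb *skb)
{
	if (skb->secpath) {
		model_state_put(skb->secpath);
		skb->secpath = NULL;
	}
}

/* Park the skb; optionally drop the extension first, as the fix does. */
static void model_skb_defer_free(struct model_skb *skb, bool drop_ext_first)
{
	if (drop_ext_first)
		model_skb_drop_secpath(skb);
	/* The skb itself stays allocated until the remote CPU frees it. */
}

/* The real free, performed later when the defer list is processed. */
static void model_skb_really_free(struct model_skb *skb)
{
	model_skb_drop_secpath(skb);
}
```

In this model, deleting the state while an skb with its secpath is
still parked leaves the device reference in place until
model_skb_really_free() runs. Dropping the extension at deferral time
lets the state, and with it the device reference, go away as soon as
the state is deleted.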

Fixes: 68822bdf76f1 ("net: generalize skb freeing deferral to per-cpu lists")
Signed-off-by: Jianbo Liu <jianbol@...dia.com>
Reviewed-by: Leon Romanovsky <leonro@...dia.com>
---
 net/core/skbuff.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index b99127712e67..d7f5024f3c08 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -7025,6 +7025,10 @@ nodefer:	__kfree_skb(skb);
 	if (READ_ONCE(sd->defer_count) >= defer_max)
 		goto nodefer;
 
+#ifdef CONFIG_XFRM
+	skb_ext_del(skb, SKB_EXT_SEC_PATH);
+#endif
+
 	spin_lock_bh(&sd->defer_lock);
 	/* Send an IPI every time queue reaches half capacity. */
 	kick = sd->defer_count == (defer_max >> 1);
-- 
2.38.1

