[<prev] [next>] [day] [month] [year] [list]
Message-ID: <1de734e2-1da6-4b5c-8e03-66af7f88d074@I-love.SAKURA.ne.jp>
Date: Wed, 28 Jan 2026 00:32:08 +0900
From: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
To: Steffen Klassert <steffen.klassert@...unet.com>,
Herbert Xu <herbert@...dor.apana.org.au>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
Ilan Tayari <ilant@...lanox.com>, Guy Shapiro <guysh@...lanox.com>,
Yossi Kuperman <yossiku@...lanox.com>,
Network Development <netdev@...r.kernel.org>,
Sabrina Dubroca <sd@...asysnail.net>,
Leon Romanovsky <leon@...nel.org>
Subject: [PATCH net] xfrm: always flush state and policy upon
NETDEV_DOWN/NETDEV_UNREGISTER events
syzbot is reporting that "struct xfrm_state" refcount is leaking.
unregister_netdevice: waiting for netdevsim0 to become free. Usage count = 2
ref_tracker: netdev@...f888052f24618 has 1/1 users at
__netdev_tracker_alloc include/linux/netdevice.h:4400 [inline]
netdev_tracker_alloc include/linux/netdevice.h:4412 [inline]
xfrm_dev_state_add+0x3a5/0x1080 net/xfrm/xfrm_device.c:316
xfrm_state_construct net/xfrm/xfrm_user.c:986 [inline]
xfrm_add_sa+0x34ff/0x5fa0 net/xfrm/xfrm_user.c:1022
xfrm_user_rcv_msg+0x58e/0xc00 net/xfrm/xfrm_user.c:3507
netlink_rcv_skb+0x158/0x420 net/netlink/af_netlink.c:2550
xfrm_netlink_rcv+0x71/0x90 net/xfrm/xfrm_user.c:3529
netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
netlink_unicast+0x5aa/0x870 net/netlink/af_netlink.c:1344
netlink_sendmsg+0x8c8/0xdd0 net/netlink/af_netlink.c:1894
sock_sendmsg_nosec net/socket.c:727 [inline]
__sock_sendmsg net/socket.c:742 [inline]
____sys_sendmsg+0xa5d/0xc30 net/socket.c:2592
___sys_sendmsg+0x134/0x1d0 net/socket.c:2646
__sys_sendmsg+0x16d/0x220 net/socket.c:2678
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
This is because currently xfrm_dev_down() (which is called upon NETDEV_DOWN
event and NETDEV_UNREGISTER event) can release the reference to
"struct net_device" taken by xfrm_dev_state_add() only if
the NETIF_F_HW_ESP bit is set, but the NETIF_F_HW_ESP bit can be cleared
by "ethtool -K $dev esp-hw-offload off" command.
In other words, we cannot guess whether xfrm_dev_state_add() has taken a
reference to "struct net_device" based on whether the NETIF_F_HW_ESP bit
is currently set.
For recording why this patch does not re-introduce xfrm_dev_unregister(),
my guessed history about this module's NETDEV_UNREGISTER handler is shown
below.
Commit d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API")
introduced xfrm_dev_state_add(). That commit called xfrm_dev_state_add()
from xfrm_state_construct(), and introduced the NETDEV_UNREGISTER case
to xfrm_dev_event(). But that commit implemented xfrm_dev_unregister() as
a no-op, and implemented xfrm_dev_down() to call xfrm_dev_state_flush()
only if (dev->features & NETIF_F_HW_ESP) != 0. I guess that that commit
expected that NETDEV_DOWN event is fired before NETDEV_UNREGISTER event
fires, and also assumed that xfrm_dev_state_add() is called only if
(dev->features & NETIF_F_HW_ESP) != 0.
Commit ec30d78c14a8 ("xfrm: add xdst pcpu cache") added
xfrm_policy_cache_flush() call to xfrm_dev_unregister(), but
commit e4db5b61c572 ("xfrm: policy: remove pcpu policy cache") removed
xfrm_policy_cache_flush() call from xfrm_dev_unregister() and also
removed the NETDEV_UNREGISTER case from xfrm_dev_event() because
xfrm_dev_unregister() again became no-op.
Commit 03891f820c21 ("xfrm: handle NETDEV_UNREGISTER for xfrm device")
re-introduced the NETDEV_UNREGISTER case to xfrm_dev_event(), but that
commit for unknown reason chose to share xfrm_dev_down() between the
NETDEV_DOWN case and the NETDEV_UNREGISTER case. Therefore, I assumed
that doing the same behavior for both cases is desirable. If something
is wrong with this choice, please re-introduce xfrm_dev_unregister().
Reported-by: syzbot+881d65229ca4f9ae8c84@...kaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
Fixes: d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API")
Signed-off-by: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
---
Previous discussion is https://lkml.kernel.org/r/924f9cf5-599a-48f0-b1e3-94cd971965b0@I-love.SAKURA.ne.jp
net/xfrm/xfrm_device.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
index 52ae0e034d29..26e62b6a9db5 100644
--- a/net/xfrm/xfrm_device.c
+++ b/net/xfrm/xfrm_device.c
@@ -536,10 +536,8 @@ static int xfrm_api_check(struct net_device *dev)
static int xfrm_dev_down(struct net_device *dev)
{
- if (dev->features & NETIF_F_HW_ESP) {
- xfrm_dev_state_flush(dev_net(dev), dev, true);
- xfrm_dev_policy_flush(dev_net(dev), dev, true);
- }
+ xfrm_dev_state_flush(dev_net(dev), dev, true);
+ xfrm_dev_policy_flush(dev_net(dev), dev, true);
return NOTIFY_DONE;
}
--
2.47.3
Powered by blists - more mailing lists