Message-ID: <d943f806-4da6-4970-ac28-b9373b0e63ac@I-love.SAKURA.ne.jp>
Date: Sat, 20 Dec 2025 23:57:06 +0900
From: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
To: David Ahern <dsahern@...nel.org>, "David S. Miller"
<davem@...emloft.net>,
Kuniyuki Iwashima <kuniyu@...gle.com>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Network Development <netdev@...r.kernel.org>
Subject: [BUG nexthop] refcount leak in "struct nexthop" handling
syzbot is reporting a refcount leak in "struct nexthop" handling,
which manifests as a hang with the message below.
unregister_netdevice: waiting for lo to become free. Usage count = 2
ref_tracker: netdev@...f88803a65e618 has 1/1 users at
__netdev_tracker_alloc include/linux/netdevice.h:4400 [inline]
netdev_tracker_alloc include/linux/netdevice.h:4412 [inline]
netdev_get_by_index+0x7c/0xb0 net/core/dev.c:1008
fib6_nh_init+0x791/0x1fb0 net/ipv6/route.c:3590
nh_create_ipv6 net/ipv4/nexthop.c:2875 [inline]
nexthop_create net/ipv4/nexthop.c:2926 [inline]
nexthop_add net/ipv4/nexthop.c:2963 [inline]
rtm_new_nexthop+0x244b/0x87d0 net/ipv4/nexthop.c:3277
rtnetlink_rcv_msg+0x95e/0xe90 net/core/rtnetlink.c:6958
netlink_rcv_skb+0x158/0x420 net/netlink/af_netlink.c:2550
netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
netlink_unicast+0x5aa/0x870 net/netlink/af_netlink.c:1344
netlink_sendmsg+0x8c8/0xdd0 net/netlink/af_netlink.c:1894
sock_sendmsg_nosec net/socket.c:727 [inline]
__sock_sendmsg net/socket.c:742 [inline]
____sys_sendmsg+0xa5d/0xc30 net/socket.c:2592
___sys_sendmsg+0x134/0x1d0 net/socket.c:2646
__sys_sendmsg+0x16d/0x220 net/socket.c:2678
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Commit ab84be7e54fc ("net: Initial nexthop code") says
Nexthop notifications are sent when a nexthop is added or deleted,
but NOT if the delete is due to a device event or network namespace
teardown (which also involves device events).
so I guess it is intended behavior that
nexthop_notify(RTM_DELNEXTHOP) is not called from remove_nexthop() from
flush_all_nexthops() from nexthop_net_exit_rtnl() from ops_undo_list()
from cleanup_net(), because remove_nexthop() is called with nlinfo == NULL
on that path.
However, as the attached reproducer demonstrates, a userspace process may
terminate, and network namespace teardown then happens automatically, without
an explicit RTM_DELNEXTHOP request ever being issued; this cannot be prevented.
The kernel is currently not prepared for such a scenario. How should this
problem be fixed?
Link: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
View attachment "repro.c" of type "text/plain" (5370 bytes)