Message-ID: <20250502005933.64039-1-kuniyu@amazon.com>
Date: Thu, 1 May 2025 17:53:22 -0700
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <syzbot+8dd1a8ebe4c3793f5aca@...kaller.appspotmail.com>
CC: <davem@...emloft.net>, <dsahern@...nel.org>, <edumazet@...gle.com>,
<horms@...nel.org>, <kuba@...nel.org>, <linux-kernel@...r.kernel.org>,
<netdev@...r.kernel.org>, <pabeni@...hat.com>,
<syzkaller-bugs@...glegroups.com>, <kuniyu@...zon.com>
Subject: Re: [syzbot] [net?] WARNING: suspicious RCU usage in __fib6_update_sernum_upto_root
From: syzbot <syzbot+8dd1a8ebe4c3793f5aca@...kaller.appspotmail.com>
Date: Thu, 01 May 2025 04:17:30 -0700
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: b6ea1680d0ac Merge tag 'v6.15-p6' of git://git.kernel.org/..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1457502f980000
> kernel config: https://syzkaller.appspot.com/x/.config?x=a42a9d552788177b
> dashboard link: https://syzkaller.appspot.com/bug?extid=8dd1a8ebe4c3793f5aca
> compiler: Debian clang version 20.1.2 (++20250402124445+58df0ef89dd6-1~exp1~20250402004600.97), Debian LLD 20.1.2
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/8b0865e8a7ea/disk-b6ea1680.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/fab387b8c42a/vmlinux-b6ea1680.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/bfb50db06aa1/bzImage-b6ea1680.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+8dd1a8ebe4c3793f5aca@...kaller.appspotmail.com
>
> =============================
> WARNING: suspicious RCU usage
> 6.15.0-rc4-syzkaller-00042-gb6ea1680d0ac #0 Not tainted
> -----------------------------
> net/ipv6/ip6_fib.c:1351 suspicious rcu_dereference_protected() usage!
This is
	struct fib6_node *fn = rcu_dereference_protected(rt->fib6_node,
				lockdep_is_held(&rt->fib6_table->tb6_lock));
and ...
>
> other info that might help us debug this:
>
>
> rcu_scheduler_active = 2, debug_locks = 1
> 3 locks held by syz.0.6334/23457:
> #0: ffffffff8f2e2008 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_lock net/core/rtnetlink.c:80 [inline]
> #0: ffffffff8f2e2008 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_nets_lock net/core/rtnetlink.c:341 [inline]
> #0: ffffffff8f2e2008 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_newlink+0x8db/0x1c70 net/core/rtnetlink.c:4064
> #1: ffffffff8df3b860 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:331 [inline]
> #1: ffffffff8df3b860 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:841 [inline]
> #1: ffffffff8df3b860 (rcu_read_lock){....}-{1:3}, at: __fib6_clean_all+0x9b/0x380 net/ipv6/ip6_fib.c:2263
> #2: ffff88807b4a5830 (&tb->tb6_lock){+.-.}-{3:3}, at: spin_lock_bh include/linux/spinlock.h:356 [inline]
> #2: ffff88807b4a5830 (&tb->tb6_lock){+.-.}-{3:3}, at: __fib6_clean_all+0x1ce/0x380 net/ipv6/ip6_fib.c:2267
>
> stack backtrace:
> CPU: 0 UID: 0 PID: 23457 Comm: syz.0.6334 Not tainted 6.15.0-rc4-syzkaller-00042-gb6ea1680d0ac #0 PREEMPT(full)
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/19/2025
> Call Trace:
> <TASK>
> dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
> lockdep_rcu_suspicious+0x140/0x1d0 kernel/locking/lockdep.c:6865
> __fib6_update_sernum_upto_root+0x223/0x230 net/ipv6/ip6_fib.c:1350
> fib6_update_sernum_upto_root+0x125/0x190 net/ipv6/ip6_fib.c:1364
> fib6_ifup+0x142/0x180 net/ipv6/route.c:4818
> fib6_clean_node+0x24a/0x590 net/ipv6/ip6_fib.c:2199
> fib6_walk_continue+0x678/0x910 net/ipv6/ip6_fib.c:2124
> fib6_walk+0x149/0x290 net/ipv6/ip6_fib.c:2172
> fib6_clean_tree net/ipv6/ip6_fib.c:2252 [inline]
> __fib6_clean_all+0x234/0x380 net/ipv6/ip6_fib.c:2268
Here we hold rcu_read_lock() and spin_lock_bh(&table->tb6_lock),
so is rt->fib6_table->tb6_lock different from table->tb6_lock?

fib6_link_table() is called without a lock in fib6_tables_init(),
but that should be fine unless someone asynchronously triggers a
thread that tries to access the main/local route table during
__net_init and races with fib6_tables_init()?

In such a case, a possible fix would be
---8<---
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 1f860340690c..a26b8e9896f5 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -304,8 +304,10 @@ EXPORT_SYMBOL_GPL(fib6_get_table);
 static void __net_init fib6_tables_init(struct net *net)
 {
+	spin_lock_bh(&net->ipv6.fib_table_hash_lock);
 	fib6_link_table(net, net->ipv6.fib6_main_tbl);
 	fib6_link_table(net, net->ipv6.fib6_local_tbl);
+	spin_unlock_bh(&net->ipv6.fib_table_hash_lock);
 }
 #else
---8<---
but I'd like to wait for a repro.
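
To illustrate why the splat suggests rt->fib6_table differs from the
table being walked, here is a minimal userspace sketch. It is not
kernel code: the toy_* types and helpers are hypothetical stand-ins,
and the "held" flag only imitates what lockdep_is_held() reports. It
assumes the walker takes the lock of the table it walks, while the
check looks at the lock of the table the route points to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a lockdep-tracked spinlock: only records "held". */
struct toy_lock {
	bool held;
};

/* Toy stand-in for struct fib6_table. */
struct toy_table {
	struct toy_lock tb6_lock;
};

/* Toy stand-in for struct fib6_info: a route pointing back to its table. */
struct toy_route {
	struct toy_table *fib6_table;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Imitates lockdep_is_held(&rt->fib6_table->tb6_lock). */
static bool toy_route_lock_held(const struct toy_route *rt)
{
	return rt->fib6_table->tb6_lock.held;
}

/* Walk "walked" under its own lock and evaluate the check for rt. */
static bool toy_walk_check(struct toy_table *walked, const struct toy_route *rt)
{
	bool ok;

	toy_lock_acquire(&walked->tb6_lock);
	ok = toy_route_lock_held(rt);
	toy_lock_release(&walked->tb6_lock);

	return ok;
}

/* Build two tables; attach the route to the walked one or to the other. */
static bool toy_check(bool same_table)
{
	struct toy_table walked = { .tb6_lock = { .held = false } };
	struct toy_table other = { .tb6_lock = { .held = false } };
	struct toy_route rt = { .fib6_table = same_table ? &walked : &other };

	return toy_walk_check(&walked, &rt);
}
```

In this model toy_check(true) returns true (the check passes), while
toy_check(false) returns false, which corresponds to the situation
the warning would indicate if the route's table and the walked table
were different objects.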
> rt6_sync_up+0x128/0x160 net/ipv6/route.c:4837
> addrconf_notify+0xd55/0x1010 net/ipv6/addrconf.c:3729
> notifier_call_chain+0x1b3/0x3e0 kernel/notifier.c:85
> netif_state_change+0x284/0x3a0 net/core/dev.c:1530
> do_setlink+0x2eb6/0x40d0 net/core/rtnetlink.c:3399
> rtnl_group_changelink net/core/rtnetlink.c:3783 [inline]
> __rtnl_newlink net/core/rtnetlink.c:3937 [inline]
> rtnl_newlink+0x149f/0x1c70 net/core/rtnetlink.c:4065
> rtnetlink_rcv_msg+0x7cc/0xb70 net/core/rtnetlink.c:6955
> netlink_rcv_skb+0x219/0x490 net/netlink/af_netlink.c:2534
> netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline]
> netlink_unicast+0x758/0x8d0 net/netlink/af_netlink.c:1339
> netlink_sendmsg+0x805/0xb30 net/netlink/af_netlink.c:1883
> sock_sendmsg_nosec net/socket.c:712 [inline]
> __sock_sendmsg+0x219/0x270 net/socket.c:727
> ____sys_sendmsg+0x505/0x830 net/socket.c:2566
> ___sys_sendmsg+0x21f/0x2a0 net/socket.c:2620
> __sys_sendmsg net/socket.c:2652 [inline]
> __do_sys_sendmsg net/socket.c:2657 [inline]
> __se_sys_sendmsg net/socket.c:2655 [inline]
> __x64_sys_sendmsg+0x19b/0x260 net/socket.c:2655
> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> do_syscall_64+0xf6/0x210 arch/x86/entry/syscall_64.c:94
> entry_SYSCALL_64_after_hwframe+0x77/0x7f