Message-Id: <20240321222054.2462-1-hdanton@sina.com>
Date: Fri, 22 Mar 2024 06:20:54 +0800
From: Hillf Danton <hdanton@...a.com>
To: Antoine Tenart <atenart@...nel.org>
Cc: linux-kernel@...r.kernel.org,
netdev@...r.kernel.org,
pabeni@...hat.com,
syzkaller-bugs@...glegroups.com,
Eric Dumazet <edumazet@...gle.com>,
syzbot <syzbot+99b8125966713aa4b0c3@...kaller.appspotmail.com>
Subject: Re: [syzbot] [net?] INFO: task hung in register_nexthop_notifier (3)
On Thu, 21 Mar 2024 10:22:25 +0100 Antoine Tenart <atenart@...nel.org> wrote:
> Quoting Eric Dumazet (2024-03-18 15:46:37)
> > On Mon, Mar 18, 2024 at 12:26 PM syzbot
> > <syzbot+99b8125966713aa4b0c3@...kaller.appspotmail.com> wrote:
> > >
> > > INFO: task syz-executor.3:6975 blocked for more than 143 seconds.
> > > Not tainted 6.8.0-rc7-syzkaller-02500-g76839e2f1fde #0
> > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > task:syz-executor.3 state:D stack:20920 pid:6975 tgid:6975 ppid:1 flags:0x00004006
> > > Call Trace:
> > > <TASK>
> > > context_switch kernel/sched/core.c:5400 [inline]
> > > __schedule+0x17d1/0x49f0 kernel/sched/core.c:6727
> > > __schedule_loop kernel/sched/core.c:6802 [inline]
> > > schedule+0x149/0x260 kernel/sched/core.c:6817
> > > schedule_preempt_disabled+0x13/0x30 kernel/sched/core.c:6874
> > > __mutex_lock_common kernel/locking/mutex.c:684 [inline]
> > > __mutex_lock+0x6a3/0xd70 kernel/locking/mutex.c:752
> > > register_nexthop_notifier+0x84/0x290 net/ipv4/nexthop.c:3863
> > > nsim_fib_create+0x8a6/0xa70 drivers/net/netdevsim/fib.c:1587
> > > nsim_drv_probe+0x747/0xb80 drivers/net/netdevsim/dev.c:1582
> > > really_probe+0x29e/0xc50 drivers/base/dd.c:658
> > > __driver_probe_device+0x1a2/0x3e0 drivers/base/dd.c:800
> > > driver_probe_device+0x50/0x430 drivers/base/dd.c:830
> > > __device_attach_driver+0x2d6/0x530 drivers/base/dd.c:958
> > > bus_for_each_drv+0x24e/0x2e0 drivers/base/bus.c:457
> > > __device_attach+0x333/0x520 drivers/base/dd.c:1030
> > > bus_probe_device+0x189/0x260 drivers/base/bus.c:532
> > > device_add+0x8ff/0xca0 drivers/base/core.c:3639
> > > nsim_bus_dev_new drivers/net/netdevsim/bus.c:442 [inline]
> > > new_device_store+0x3f2/0x890 drivers/net/netdevsim/bus.c:173
> > > kernfs_fop_write_iter+0x3a4/0x500 fs/kernfs/file.c:334
> >
> > So we have a sysfs handler ultimately calling register_nexthop_notifier()
> > or any other network control path requiring RTNL.
> >
> > Note that we have rtnl_trylock() for a reason...
>
> Mentioning the below in case that gives some ideas; feel free to
> disregard.
>
> When I looked at similar issues a while ago the rtnl deadlock actually
> happened with the kernfs_node refcount; haven't looked at this one in
> detail though. The mutex in there was just preventing concurrent
> writers.
>
> > Or maybe the reason is wrong, if we could change kernfs_fop_write_iter()
> > to no longer hold a mutex...
That would be better done after working out why RCU stalled [1]:
5 locks held by kworker/u4:7/23559:
#0: ffff888015ea4938 ((wq_completion)netns){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:2608 [inline]
#0: ffff888015ea4938 ((wq_completion)netns){+.+.}-{0:0}, at: process_scheduled_works+0x825/0x1420 kernel/workqueue.c:2706
#1: ffffc90012b8fd20 (net_cleanup_work){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:2608 [inline]
#1: ffffc90012b8fd20 (net_cleanup_work){+.+.}-{0:0}, at: process_scheduled_works+0x825/0x1420 kernel/workqueue.c:2706
#2: ffffffff8f36d250 (pernet_ops_rwsem){++++}-{3:3}, at: cleanup_net+0x16a/0xcc0 net/core/net_namespace.c:591
#3: ffffffff8f3798c8 (rtnl_mutex){+.+.}-{3:3}, at: cleanup_net+0x6af/0xcc0 net/core/net_namespace.c:627
#4: ffffffff8e136440 (rcu_state.barrier_mutex){+.+.}-{3:3}, at: rcu_barrier+0x4c/0x550 kernel/rcu/tree.c:4064
[1] https://lore.kernel.org/lkml/0000000000009485160613eda067@google.com/
>
> At the time I found a way to safely drop the refcount of those
> kernfs_node which then allowed calling rtnl_lock from sysfs handlers,
> https://lore.kernel.org/all/20231018154804.420823-1-atenart@kernel.org/T/
>
> Note that this relied on how net devices are unregistered (calling
> device_del under rtnl and later waiting for refs on the netdev to drop
> outside of the lock; and a few other things), so extra modifications
> would be needed to generalize the approach. Also it's a tradeoff between
> fixing those deadlocks without rtnl_trylock and maintaining a quite
> complex logic...
>
> Antoine
>
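For readers unfamiliar with the rtnl_trylock() remark quoted above: networking sysfs handlers commonly avoid sleeping on RTNL by trying the lock and, on failure, returning via restart_syscall() so the write is retried after the handler has unwound. The sketch below is only a user-space model of that shape, not kernel code. Every name in it (model_rtnl, model_rtnl_trylock, model_rtnl_unlock, model_store, MODEL_RESTART, model_demo) is hypothetical, and a C11 atomic_flag stands in for rtnl_mutex.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* User-space stand-in for rtnl_mutex; NOT the kernel implementation. */
static atomic_flag model_rtnl = ATOMIC_FLAG_INIT;

/* Hypothetical helpers mimicking the shape of rtnl_trylock()/rtnl_unlock(). */
bool model_rtnl_trylock(void)
{
	/* test_and_set returns the previous value: false means we got the lock. */
	return !atomic_flag_test_and_set(&model_rtnl);
}

void model_rtnl_unlock(void)
{
	atomic_flag_clear(&model_rtnl);
}

/* Hypothetical code standing in for the "retry the write" outcome. */
#define MODEL_RESTART (-EAGAIN)

/*
 * Model of a sysfs store handler. It never blocks on the lock: if the
 * lock is busy it backs out, so whatever the caller holds can be
 * released before the operation is attempted again.
 */
int model_store(int *setting, int value)
{
	if (!model_rtnl_trylock())
		return MODEL_RESTART;

	*setting = value;
	model_rtnl_unlock();
	return 0;
}

/* Walk-through: uncontended store, contended store, then the retry. */
void model_demo(void)
{
	int setting = 0;
	int ret;
	bool held;

	ret = model_store(&setting, 7);
	assert(ret == 0);
	assert(setting == 7);

	held = model_rtnl_trylock();	/* pretend another path holds RTNL */
	assert(held);
	ret = model_store(&setting, 9);
	assert(ret == MODEL_RESTART);
	assert(setting == 7);		/* nothing was changed */

	model_rtnl_unlock();
	ret = model_store(&setting, 9);	/* the retry now succeeds */
	assert(ret == 0);
	assert(setting == 9);
	(void)ret;
	(void)held;
}
```

The point of the model is that a handler which backs out instead of sleeping cannot take part in a lock cycle through RTNL. The cost is a retry loop under contention, which is the tradeoff against the refcount-dropping approach discussed in the thread.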