Message-ID: <CANn89iK7Nf9WPwg3XkwAMfCqnidtjsB9fSr3025rsUnpuwXJ2w@mail.gmail.com>
Date: Wed, 6 Dec 2023 11:25:52 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: Davide Caratti <dcaratti@...hat.com>
Cc: "David S. Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Jamal Hadi Salim <jhs@...atatu.com>,
Cong Wang <xiyou.wangcong@...il.com>, Jiri Pirko <jiri@...nulli.us>, netdev@...r.kernel.org,
xmu@...hat.com, cpaasch@...le.com
Subject: Re: [PATCH net-next] net/sched: fix false lockdep warning on qdisc
root lock
On Wed, Dec 6, 2023 at 11:16 AM Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Wed, Dec 6, 2023 at 10:04 AM Davide Caratti <dcaratti@...hat.com> wrote:
> >
> > Xiumei and Christoph reported the following lockdep splat; it complains
> > that the qdisc root lock is being taken twice:
> >
> > ============================================
> > WARNING: possible recursive locking detected
> > 6.7.0-rc3+ #598 Not tainted
> > --------------------------------------------
> > swapper/2/0 is trying to acquire lock:
> > ffff888177190110 (&sch->q.lock){+.-.}-{2:2}, at: __dev_queue_xmit+0x1560/0x2e70
> >
> > but task is already holding lock:
> > ffff88811995a110 (&sch->q.lock){+.-.}-{2:2}, at: __dev_queue_xmit+0x1560/0x2e70
> >
> > other info that might help us debug this:
> > Possible unsafe locking scenario:
> >
> > CPU0
> > ----
> > lock(&sch->q.lock);
> > lock(&sch->q.lock);
> >
> > *** DEADLOCK ***
> >
> > May be due to missing lock nesting notation
> >
> > 5 locks held by swapper/2/0:
> > #0: ffff888135a09d98 ((&in_dev->mr_ifc_timer)){+.-.}-{0:0}, at: call_timer_fn+0x11a/0x510
> > #1: ffffffffaaee5260 (rcu_read_lock){....}-{1:2}, at: ip_finish_output2+0x2c0/0x1ed0
> > #2: ffffffffaaee5200 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x209/0x2e70
> > #3: ffff88811995a110 (&sch->q.lock){+.-.}-{2:2}, at: __dev_queue_xmit+0x1560/0x2e70
> > #4: ffffffffaaee5200 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x209/0x2e70
> >
> >
>
> Can you add a Fixes: tag?
>
> Also, what is the interaction with htb_set_lockdep_class_child()? Have
> you tried using HTB after your patch?
>
> Could htb_set_lockdep_class_child() be removed?
>
>
> > CC: Xiumei Mu <xmu@...hat.com>
> > Reported-by: Christoph Paasch <cpaasch@...le.com>
> > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/451
> > Signed-off-by: Davide Caratti <dcaratti@...hat.com>
> > ---
> > include/net/sch_generic.h | 1 +
> > net/sched/sch_generic.c | 3 +++
> > 2 files changed, 4 insertions(+)
> >
> > diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
> > index dcb9160e6467..a395ca76066c 100644
> > --- a/include/net/sch_generic.h
> > +++ b/include/net/sch_generic.h
> > @@ -126,6 +126,7 @@ struct Qdisc {
> >
> > struct rcu_head rcu;
> > netdevice_tracker dev_tracker;
> > + struct lock_class_key root_lock_key;
> > /* private data */
> > long privdata[] ____cacheline_aligned;
> > };
> > diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> > index 8dd0e5925342..da3e1ea42852 100644
> > --- a/net/sched/sch_generic.c
> > +++ b/net/sched/sch_generic.c
> > @@ -944,7 +944,9 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
> > __skb_queue_head_init(&sch->gso_skb);
> > __skb_queue_head_init(&sch->skb_bad_txq);
> > gnet_stats_basic_sync_init(&sch->bstats);
> > + lockdep_register_key(&sch->root_lock_key);
> > spin_lock_init(&sch->q.lock);
> > + lockdep_set_class(&sch->q.lock, &sch->root_lock_key);
> >
> > if (ops->static_flags & TCQ_F_CPUSTATS) {
> > sch->cpu_bstats =
> > @@ -1064,6 +1066,7 @@ static void __qdisc_destroy(struct Qdisc *qdisc)
> > if (ops->destroy)
> > ops->destroy(qdisc);
> >
> > + lockdep_unregister_key(&qdisc->root_lock_key);
lockdep_unregister_key() has a synchronize_rcu() call.
This would slow down qdisc dismantle too much.
I think we need to find another solution to this problem.