[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aHlXt3HBd--0JGqZ@xps>
Date: Thu, 17 Jul 2025 13:06:15 -0700
From: Xiang Mei <xmei5@....edu>
To: Dan Carpenter <dan.carpenter@...aro.org>
Cc: netdev@...r.kernel.org
Subject: Re: [bug report] net/sched: sch_qfq: Fix race condition on
qfq_aggregate
On Thu, Jul 17, 2025 at 11:51:43AM -0500, Dan Carpenter wrote:
> Hello Xiang Mei,
>
> Commit 5e28d5a3f774 ("net/sched: sch_qfq: Fix race condition on
> qfq_aggregate") from Jul 10, 2025 (linux-next), leads to the
> following Smatch static checker warning:
>
> net/sched/sch_generic.c:1107 qdisc_put()
> warn: sleeping in atomic context
>
> 547 static int qfq_delete_class(struct Qdisc *sch, unsigned long arg,
> 548 struct netlink_ext_ack *extack)
> 549 {
> 550 struct qfq_sched *q = qdisc_priv(sch);
> 551 struct qfq_class *cl = (struct qfq_class *)arg;
> 552
> 553 if (qdisc_class_in_use(&cl->common)) {
> 554 NL_SET_ERR_MSG_MOD(extack, "QFQ class in use");
> 555 return -EBUSY;
> 556 }
> 557
> 558 sch_tree_lock(sch);
> 559
> 560 qdisc_purge_queue(cl->qdisc);
> 561 qdisc_class_hash_remove(&q->clhash, &cl->common);
> 562 qfq_destroy_class(sch, cl);
> ^^^^^^^^^^^^^^^^^
> We used to unlock first and then did the destroy but the patch moved
> this qfq_destroy_class() under the sch_tree_unlock() to solve a race
> condition. Unfortunately, it introduces a sleeping in atomic context.
>
> 563
> 564 sch_tree_unlock(sch);
> 565
> 566 return 0;
> 567 }
>
> The call tree is:
>
> qfq_delete_class() <- disables preempt
> -> qfq_destroy_class()
> -> qdisc_put() <- sleeps
>
> net/sched/sch_generic.c
> 1098 void qdisc_put(struct Qdisc *qdisc)
> 1099 {
> 1100 if (!qdisc)
> 1101 return;
> 1102
> 1103 if (qdisc->flags & TCQ_F_BUILTIN ||
> 1104 !refcount_dec_and_test(&qdisc->refcnt))
> 1105 return;
> 1106
> --> 1107 __qdisc_destroy(qdisc);
>
> It's the lockdep_unregister_key() call which sleeps.
>
> 1108 }
>
> regards,
> dan carpenter
Thanks Dan for the explanations.
What do you think about this solution: We split qfq_destory_class to two
parts: qfq_rm_from_agg(q, cl) and the left calls. Since the race condition
is about agg, we can keep the left calls out of the lock but moving
qfq_rm_from_agg into the lock.
This could avoid calling __qdisc_destroy in the lock. Please let me know
if it works, I can help to deliever a new version of patch.
Best,
Xiang
Powered by blists - more mailing lists