Message-ID: <aHlXt3HBd--0JGqZ@xps>
Date: Thu, 17 Jul 2025 13:06:15 -0700
From: Xiang Mei <xmei5@....edu>
To: Dan Carpenter <dan.carpenter@...aro.org>
Cc: netdev@...r.kernel.org
Subject: Re: [bug report] net/sched: sch_qfq: Fix race condition on
 qfq_aggregate

On Thu, Jul 17, 2025 at 11:51:43AM -0500, Dan Carpenter wrote:
> Hello Xiang Mei,
> 
> Commit 5e28d5a3f774 ("net/sched: sch_qfq: Fix race condition on
> qfq_aggregate") from Jul 10, 2025 (linux-next), leads to the
> following Smatch static checker warning:
> 
> 	net/sched/sch_generic.c:1107 qdisc_put()
> 	warn: sleeping in atomic context
> 
>    547  static int qfq_delete_class(struct Qdisc *sch, unsigned long arg,
>    548                              struct netlink_ext_ack *extack)
>    549  {
>    550          struct qfq_sched *q = qdisc_priv(sch);
>    551          struct qfq_class *cl = (struct qfq_class *)arg;
>    552  
>    553          if (qdisc_class_in_use(&cl->common)) {
>    554                  NL_SET_ERR_MSG_MOD(extack, "QFQ class in use");
>    555                  return -EBUSY;
>    556          }
>    557  
>    558          sch_tree_lock(sch);
>    559  
>    560          qdisc_purge_queue(cl->qdisc);
>    561          qdisc_class_hash_remove(&q->clhash, &cl->common);
>    562          qfq_destroy_class(sch, cl);
>                 ^^^^^^^^^^^^^^^^^
> We used to unlock first and then did the destroy but the patch moved
> this qfq_destroy_class() under the sch_tree_unlock() to solve a race
> condition.  Unfortunately, it introduces a sleeping in atomic context.
> 
>    563  
>    564          sch_tree_unlock(sch);
>    565  
>    566          return 0;
>    567  }
> 
> The call tree is:
> 
> qfq_delete_class() <- disables preempt
> -> qfq_destroy_class()
>    -> qdisc_put() <- sleeps
> 
> net/sched/sch_generic.c
>     1098 void qdisc_put(struct Qdisc *qdisc)
>     1099 {
>     1100         if (!qdisc)
>     1101                 return;
>     1102 
>     1103         if (qdisc->flags & TCQ_F_BUILTIN ||
>     1104             !refcount_dec_and_test(&qdisc->refcnt))
>     1105                 return;
>     1106 
> --> 1107         __qdisc_destroy(qdisc);
> 
> It's the lockdep_unregister_key() call which sleeps.
> 
>     1108 }
> 
> regards,
> dan carpenter
Thanks, Dan, for the explanation.

What do you think about this solution: we split qfq_destroy_class() into two
parts: qfq_rm_from_agg(q, cl) and the remaining calls. Since the race condition
is about agg, we can keep the remaining calls outside the lock and move only
qfq_rm_from_agg() inside it.

This would avoid calling __qdisc_destroy() under the lock. Please let me know
if this works; I can help deliver a new version of the patch.
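
To make the proposed ordering concrete, here is a rough userspace sketch in C. It is not kernel code and not the actual patch: every mock_* name is hypothetical, a boolean flag stands in for sch_tree_lock()/sch_tree_unlock(), and the asserts only model the rule that the aggregate update must run under the lock while the teardown that may sleep must run outside it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for struct qfq_sched and struct qfq_class. */
struct mock_sched {
	bool tree_locked;	/* models the qdisc tree lock being held */
	int agg_members;	/* models per-aggregate state guarded by the lock */
};

struct mock_class {
	bool in_agg;
	bool destroyed;
};

static void mock_tree_lock(struct mock_sched *q)
{
	assert(!q->tree_locked);
	q->tree_locked = true;
}

static void mock_tree_unlock(struct mock_sched *q)
{
	assert(q->tree_locked);
	q->tree_locked = false;
}

/* Models qfq_rm_from_agg(): touches aggregate state, so it needs the lock. */
static void mock_rm_from_agg(struct mock_sched *q, struct mock_class *cl)
{
	assert(q->tree_locked);
	if (cl->in_agg) {
		cl->in_agg = false;
		q->agg_members--;
	}
}

/* Models the rest of the teardown (the part that reaches qdisc_put()),
 * which may sleep and therefore must not run with the lock held. */
static void mock_destroy_rest(struct mock_sched *q, struct mock_class *cl)
{
	assert(!q->tree_locked);
	cl->destroyed = true;
}

/* Proposed ordering: detach from the aggregate under the lock,
 * then finish the teardown after unlocking. */
static int mock_delete_class(struct mock_sched *q, struct mock_class *cl)
{
	mock_tree_lock(q);
	mock_rm_from_agg(q, cl);
	mock_tree_unlock(q);

	mock_destroy_rest(q, cl);
	return 0;
}
```

In the real function, qdisc_purge_queue() and qdisc_class_hash_remove() would presumably stay under the lock as they are today; the sketch leaves them out to keep the focus on the split.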

Best,
Xiang
