Message-ID: <ZIAGxoRQIrLqZbc0@C02FL77VMD6R.googleapis.com>
Date: Tue, 6 Jun 2023 21:25:42 -0700
From: Peilin Ye <yepeilin.cs@...il.com>
To: Pedro Tammela <pctammela@...atatu.com>
Cc: Vlad Buslov <vladbu@...dia.com>, Jamal Hadi Salim <jhs@...atatu.com>,
	Jakub Kicinski <kuba@...nel.org>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
	Cong Wang <xiyou.wangcong@...il.com>, Jiri Pirko <jiri@...nulli.us>,
	Peilin Ye <peilin.ye@...edance.com>,
	Daniel Borkmann <daniel@...earbox.net>,
	John Fastabend <john.fastabend@...il.com>,
	Hillf Danton <hdanton@...a.com>, netdev@...r.kernel.org,
	Cong Wang <cong.wang@...edance.com>
Subject: Re: [PATCH v5 net 6/6] net/sched: qdisc_destroy() old ingress and
 clsact Qdiscs before grafting

Hi Pedro,

On Thu, Jun 01, 2023 at 10:03:22AM -0300, Pedro Tammela wrote:
> On 01/06/2023 00:57, Peilin Ye wrote:
> > -                       /* Replay if the current ingress (or clsact) Qdisc has ongoing
> > -                        * RTNL-unlocked filter request(s).  This is the counterpart of that
> > -                        * qdisc_refcount_inc_nz() call in __tcf_qdisc_find().
> > +                       /* If current ingress (clsact) Qdisc has ongoing filter requests, stop
> > +                        * accepting any more by marking it as "being destroyed", then tell the
> > +                        * caller to replay by returning -EAGAIN.
> >                           */
> > -                       if (!qdisc_refcount_dec_if_one(dev_queue->qdisc_sleeping))
> > +                       q = dev_queue->qdisc_sleeping;
> > +                       if (!qdisc_refcount_dec_if_one(q)) {
> > +                               q->flags |= TCQ_F_DESTROYING;
> > +                               rtnl_unlock();
> > +                               schedule();
>
> Was this intended or just a leftover?
> rtnl_lock() would reschedule if needed as it's a mutex_lock

In qdisc_create():

			rtnl_unlock();
			request_module("sch_%s", name);
			rtnl_lock();
			ops = qdisc_lookup_ops(kind);
			if (ops != NULL) {
				/* We will try again qdisc_lookup_ops,
				 * so don't keep a reference.
				 */
				module_put(ops->owner);
				err = -EAGAIN;
				goto err_out;

If qdisc_lookup_ops() returns !NULL, it means we've successfully loaded the
module, so we know that the next replay should (normally) succeed.
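
For reference, that -EAGAIN (like the one my diff returns from the
grafting path) is consumed by the replay loop in tc_modify_qdisc().
Roughly, simplified from net/sched/sch_api.c (not verbatim):

	replay:
		/* Reinit, just in case something touches this. */
		...
		q = qdisc_create(dev, dev_ingress_queue(dev), p,
				 tcm->tcm_parent, tcm->tcm_parent,
				 tca, &err, extack);
		if (q == NULL) {
			if (err == -EAGAIN)
				goto replay;	/* e.g. module just loaded */
			return err;
		}

So after the diff marks the old Qdisc as TCQ_F_DESTROYING and returns
-EAGAIN, the whole request is retried under a freshly taken RTNL mutex.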

However, in the diff I posted, if qdisc_refcount_dec_if_one() returns
false, we know we should step back and wait a bit - especially when there
are ongoing RTNL-locked filter requests, since we want them to grab the
RTNL mutex before us.  That was my intention here: unconditionally give up
the CPU.  I haven't tested whether it really makes a difference, though.
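
To spell out the intent with comments (the annotations are mine; the rest
of the hunk, which re-takes rtnl_lock() and returns -EAGAIN, isn't quoted
above):

	q = dev_queue->qdisc_sleeping;
	if (!qdisc_refcount_dec_if_one(q)) {
		/* Someone else still holds a reference, i.e. there are
		 * ongoing RTNL-unlocked filter requests; stop accepting
		 * new ones.
		 */
		q->flags |= TCQ_F_DESTROYING;
		rtnl_unlock();
		/* We are still in TASK_RUNNING state here, so this
		 * schedule() does not sleep - it only yields the CPU,
		 * giving RTNL-locked filter requests a chance to grab
		 * the RTNL mutex before we replay.
		 */
		schedule();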

Thanks,
Peilin Ye

