Message-ID: <CAM0EoM=xLkAr5EF7bty+ETmZ3GXnmB9De3fYSCrQjKPb8qDy7Q@mail.gmail.com>
Date: Sun, 28 May 2023 14:54:29 -0400
From: Jamal Hadi Salim <jhs@...atatu.com>
To: Peilin Ye <yepeilin.cs@...il.com>
Cc: Jakub Kicinski <kuba@...nel.org>, Pedro Tammela <pctammela@...atatu.com>, 
	"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, 
	Paolo Abeni <pabeni@...hat.com>, Cong Wang <xiyou.wangcong@...il.com>, 
	Jiri Pirko <jiri@...nulli.us>, Peilin Ye <peilin.ye@...edance.com>, 
	Daniel Borkmann <daniel@...earbox.net>, John Fastabend <john.fastabend@...il.com>, 
	Hillf Danton <hdanton@...a.com>, netdev@...r.kernel.org, 
	Cong Wang <cong.wang@...edance.com>, Vlad Buslov <vladbu@...dia.com>
Subject: Re: [PATCH v5 net 6/6] net/sched: qdisc_destroy() old ingress and
 clsact Qdiscs before grafting

On Sat, May 27, 2023 at 4:23 AM Peilin Ye <yepeilin.cs@...il.com> wrote:
>
> Hi Jakub and all,
>
> On Fri, May 26, 2023 at 07:33:24PM -0700, Jakub Kicinski wrote:
> > On Fri, 26 May 2023 16:09:51 -0700 Peilin Ye wrote:
> > > Thanks a lot, I'll get right on it.
> >
> > Any insights? Is it just a live-lock inherent to the retry scheme,
> > or do we actually forget to release the lock/refcnt?
>
> I think it's just a thread holding the RTNL mutex for too long (replaying
> too many times).  We could replay an arbitrary number of times in
> tc_{modify,get}_qdisc() if the user keeps sending RTNL-unlocked filter
> requests for the old Qdisc.
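>
> To make the concern concrete, here is a minimal userspace analogue (an
> illustration only, not kernel code, and every name in it is made up):
> grafter() stands in for tc_{modify,get}_qdisc() replaying with RTNL held,
> filter_user() stands in for RTNL-unlocked filter requests that keep taking
> a reference on the old ingress/clsact Qdisc, and other_rtnl_user() stands
> in for any other RTNL user that has to wait meanwhile.  How long grafter()
> keeps replaying depends entirely on timing, which is exactly the concern:
>
> /* Build with: cc -O2 -pthread analogue.c */
> #include <pthread.h>
> #include <stdatomic.h>
> #include <stdbool.h>
> #include <stdio.h>
> #include <unistd.h>
>
> static pthread_mutex_t rtnl = PTHREAD_MUTEX_INITIALIZER; /* "RTNL" */
> static atomic_int refcnt = 1;                    /* old Qdisc's refcount */
>
> static bool refcount_dec_if_one(atomic_int *r)
> {
> 	int one = 1;
>
> 	return atomic_compare_exchange_strong(r, &one, 0);
> }
>
> static void *grafter(void *arg)		/* "tc_modify_qdisc()" */
> {
> 	pthread_mutex_lock(&rtnl);
> 	/* "Replay" until no concurrent user holds an extra reference,
> 	 * without ever releasing the lock in between.
> 	 */
> 	while (!refcount_dec_if_one(&refcnt))
> 		;
> 	pthread_mutex_unlock(&rtnl);
> 	puts("graft done");
> 	return NULL;
> }
>
> static void *filter_user(void *arg)	/* "RTNL-unlocked filter request" */
> {
> 	for (;;) {
> 		atomic_fetch_add(&refcnt, 1);	/* like qdisc_refcount_inc_nz() */
> 		usleep(100);			/* ...do some filter work... */
> 		atomic_fetch_sub(&refcnt, 1);
> 	}
> }
>
> static void *other_rtnl_user(void *arg)
> {
> 	pthread_mutex_lock(&rtnl);	/* waits as long as grafter() replays */
> 	puts("other RTNL user ran");
> 	pthread_mutex_unlock(&rtnl);
> 	return NULL;
> }
>
> int main(void)
> {
> 	pthread_t t[4];
>
> 	pthread_create(&t[0], NULL, filter_user, NULL);
> 	pthread_create(&t[1], NULL, filter_user, NULL);
> 	pthread_create(&t[2], NULL, grafter, NULL);
> 	sleep(1);
> 	pthread_create(&t[3], NULL, other_rtnl_user, NULL);
> 	pthread_join(t[3], NULL);	/* returns only once grafter() stops replaying */
> 	return 0;
> }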
>
> I tested the new reproducer Pedro posted, on:
>
> 1. All 6 v5 patches, FWIW, which caused a similar hang to the one Pedro
>    reported
>
> 2. First 5 v5 patches, plus patch 6 in v1 (no replaying), did not trigger
>    any issues (in about 30 minutes).
>
> 3. All 6 v5 patches, plus this diff:
>
> diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
> index 286b7c58f5b9..988718ba5abe 100644
> --- a/net/sched/sch_api.c
> +++ b/net/sched/sch_api.c
> @@ -1090,8 +1090,11 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
>                          * RTNL-unlocked filter request(s).  This is the counterpart of that
>                          * qdisc_refcount_inc_nz() call in __tcf_qdisc_find().
>                          */
> -                       if (!qdisc_refcount_dec_if_one(dev_queue->qdisc_sleeping))
> +                       if (!qdisc_refcount_dec_if_one(dev_queue->qdisc_sleeping)) {
> +                               rtnl_unlock();
> +                               rtnl_lock();
>                                 return -EAGAIN;
> +                       }
>                 }
>
>                 if (dev->flags & IFF_UP)
>
>    Did not trigger any issues (in about 30 minutes) either.
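>
>    In the userspace analogue sketched above, the moral equivalent of this
>    diff would be to drop and immediately re-take the mutex before each
>    retry in grafter().  Neither mutex is strictly fair, so this does not
>    guarantee any ordering, but it breaks the solid hold on the lock and
>    gives other waiters such as other_rtnl_user() a chance to run between
>    replays:
>
> 	pthread_mutex_lock(&rtnl);
> 	while (!refcount_dec_if_one(&refcnt)) {
> 		/* Yield point: let other "RTNL" waiters in before replaying. */
> 		pthread_mutex_unlock(&rtnl);
> 		pthread_mutex_lock(&rtnl);
> 	}
> 	pthread_mutex_unlock(&rtnl);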
>
> What would you suggest?


I am more worried it is a whack-a-mole situation. We fixed the first
reproducer with essentially patches 1-4, but we opened a new hole which
the second reproducer catches. One thing the current reproducer does
is create a lot of RTNL contention in the beginning by creating all those
devices, and after that it is just creating/deleting qdiscs and doing
updates with flower, where such contention is reduced, i.e. it may just
take longer for the mole to pop up.

Why don't we push the V1 patch in and then worry about getting clever
with EAGAIN afterwards? Can you test the V1 version with the repro Pedro
posted? It shouldn't have these issues. It would also be interesting to
see how the performance of parallel flower updates is affected.

cheers,
jamal

> Thanks,
> Peilin Ye
>
