Message-ID: <ZHG+AR8qgpJ6/Zhx@C02FL77VMD6R.googleapis.com>
Date: Sat, 27 May 2023 01:23:29 -0700
From: Peilin Ye <yepeilin.cs@...il.com>
To: Jakub Kicinski <kuba@...nel.org>, Jamal Hadi Salim <jhs@...atatu.com>,
	Pedro Tammela <pctammela@...atatu.com>
Cc: Pedro Tammela <pctammela@...atatu.com>,
	Jamal Hadi Salim <jhs@...atatu.com>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
	Cong Wang <xiyou.wangcong@...il.com>, Jiri Pirko <jiri@...nulli.us>,
	Peilin Ye <peilin.ye@...edance.com>,
	Daniel Borkmann <daniel@...earbox.net>,
	John Fastabend <john.fastabend@...il.com>,
	Hillf Danton <hdanton@...a.com>, netdev@...r.kernel.org,
	Cong Wang <cong.wang@...edance.com>, Vlad Buslov <vladbu@...dia.com>
Subject: Re: [PATCH v5 net 6/6] net/sched: qdisc_destroy() old ingress and
 clsact Qdiscs before grafting

Hi Jakub and all,

On Fri, May 26, 2023 at 07:33:24PM -0700, Jakub Kicinski wrote:
> On Fri, 26 May 2023 16:09:51 -0700 Peilin Ye wrote:
> > Thanks a lot, I'll get right on it.
>
> Any insights? Is it just a live-lock inherent to the retry scheme
> or we actually forget to release the lock/refcnt?

I think it's just a thread holding the RTNL mutex for too long (replaying
too many times).  We could replay an arbitrary number of times in
tc_{modify,get}_qdisc() if the user keeps sending RTNL-unlocked filter
requests for the old Qdisc.

I tested the new reproducer Pedro posted, on:

1. All 6 v5 patches, FWIW, which caused a hang similar to the one Pedro
   reported

2. First 5 v5 patches, plus patch 6 in v1 (no replaying), did not trigger
   any issues (in about 30 minutes).

3. All 6 v5 patches, plus this diff:

diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 286b7c58f5b9..988718ba5abe 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -1090,8 +1090,11 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
 			 * RTNL-unlocked filter request(s).  This is the counterpart of that
 			 * qdisc_refcount_inc_nz() call in __tcf_qdisc_find().
 			 */
-			if (!qdisc_refcount_dec_if_one(dev_queue->qdisc_sleeping))
+			if (!qdisc_refcount_dec_if_one(dev_queue->qdisc_sleeping)) {
+				rtnl_unlock();
+				rtnl_lock();
 				return -EAGAIN;
+			}
 		}
 
 		if (dev->flags & IFF_UP)

   Did not trigger any issues (in about 30 minutes) either.

What would you suggest?

Thanks,
Peilin Ye