Message-ID: <CAM0EoMkd87-6ZJ5PWsV8K+Pn+dVNEOP9NcfGAjXVrzAH70F4YA@mail.gmail.com>
Date: Thu, 22 May 2025 08:33:51 -0400
From: Jamal Hadi Salim <jhs@...atatu.com>
To: William Liu <will@...lsroot.io>
Cc: "netdev@...r.kernel.org" <netdev@...r.kernel.org>, Savy <savy@...t3mfailure.io>,
Cong Wang <xiyou.wangcong@...il.com>, Victor Nogueira <victor@...atatu.com>,
Pedro Tammela <pctammela@...atatu.com>, Paolo Abeni <pabeni@...hat.com>,
Jakub Kicinski <kuba@...nel.org>, Stephen Hemminger <stephen@...workplumber.org>,
Davide Caratti <dcaratti@...hat.com>
Subject: Re: [BUG] net/sched: Soft Lockup/Task Hang and OOM Loop in netem_dequeue

On Wed, May 21, 2025 at 11:38 AM William Liu <will@...lsroot.io> wrote:
>
> On Wednesday, May 21st, 2025 at 2:03 PM, William Liu <will@...lsroot.io> wrote:
>
> >
> >
> > On Wednesday, May 21st, 2025 at 11:06 AM, Jamal Hadi Salim <jhs@...atatu.com> wrote:
> >
> > > I am afraid using bits on the skb will not go well around here - and
> > > zero chance if you are using those bits for a one-off like netem.
> > > Take a look at net/sched/act_mirred.c for a more "acceptable"
> > > solution, see use of mirred_nest_level.
> >
> >
> > Ah ok, thank you for the suggestion. We will take a look at that then.
> >
> > > Not that it matters, but I don't understand why you moved the skb2 check
> > > around. Was that resolving anything meaningful?
> > >
> > > cheers,
> > > jamal
> >
> >
> > Yes - otherwise the limit value of the qdisc isn't properly respected and can be exceeded by one due to duplication. The limit check happens before both the duplication and the enqueue of the original skb, so after duplication that check would be stale for the original enqueue.
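> >
> > To illustrate, the reordering looks roughly like this (a sketch of
> > the idea against netem_enqueue, not the exact diff we posted, and
> > simplified to skip the GSO segment handling):
> >
> >     /* Duplication re-enters the qdisc tree from the root, so the
> >      * clone can consume the last remaining queue slot before the
> >      * original skb is enqueued.
> >      */
> >     if (count > 1 && (skb2 = skb_clone(skb, GFP_ATOMIC)) != NULL) {
> >             struct Qdisc *rootq = qdisc_root_bh(sch);
> >             u32 dupsave = q->duplicate;
> >
> >             /* temporarily disable duplication so the clone is not
> >              * duplicated again on re-entry
> >              */
> >             q->duplicate = 0;
> >             rootq->enqueue(skb2, rootq, to_free);
> >             q->duplicate = dupsave;
> >     }
> >
> >     /* Moved below the duplication block: checking sch->limit before
> >      * the clone is enqueued lets the queue end up one over the limit.
> >      */
> >     if (unlikely(sch->q.qlen >= sch->limit))
> >             return qdisc_drop_all(skb, sch, to_free);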
> >
> > Best,
> > Will
>
> Hi Jamal,
>
> If we use a per-cpu global variable like act_mirred.c does to track nesting, wouldn't this break in the case of multiple normal qdisc setups running in parallel across multiple interfaces?
A single skb cannot enter netem via multiple cpus. You can have
multiple cpus entering the netem, but they would be carrying
different skbs - am I missing something? Note that mirred uses a
per-cpu counter, which should suffice as a per-skb counter.
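Something along these lines - an untested sketch; the netem_* names,
the __netem_enqueue() helper, and the limit value are made up here
(see mirred_nest_level and MIRRED_NEST_LIMIT in act_mirred.c for the
real pattern):

#define NETEM_NEST_LIMIT        4       /* made-up value */

static DEFINE_PER_CPU(unsigned int, netem_nest_level);

static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch,
                         struct sk_buff **to_free)
{
        int ret;

        /* Enqueue runs with bottom halves disabled, so this counter
         * effectively tracks how deep a single skb has recursed
         * through duplicating netems on this cpu.
         */
        if (unlikely(this_cpu_inc_return(netem_nest_level) >
                     NETEM_NEST_LIMIT)) {
                this_cpu_dec(netem_nest_level);
                return qdisc_drop(skb, sch, to_free);
        }

        /* __netem_enqueue() stands in for the existing enqueue body */
        ret = __netem_enqueue(skb, sch, to_free);

        this_cpu_dec(netem_nest_level);
        return ret;
}

Since the counter is global per cpu rather than per qdisc, stacking
several duplicating netems in one hierarchy just pushes it higher,
while independent interfaces running on other cpus never see each
other's count.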
> We also considered adding a nesting tracker within netem_sched_data itself (incremented and decremented in the prologue and epilogue of netem_enqueue), but that would break when there are multiple netems with duplication enabled in the qdisc hierarchy. If we aren't going to track it in the sk_buff, I am not sure this approach is viable.
>
As long as it's per-cpu, I'm not sure where the breakage is.
> This brings us back to the approach where we don't allow duplication in netem if a parent qdisc is a netem with duplication enabled. However, one issue we are worried about is qdisc_replace: since the hierarchy can change after setup, this check would have to happen every time we want to duplicate something in enqueue, right? That isn't ideal either, but let me know if you know of a better place to add the check.
>
I didn't follow - can you be more specific?
cheers,
jamal
> Will
> Savy