Message-ID: <mz8HnpeShmNHFgeE6yoGG_gb5l1mHqvNee9aRtGX6yTz5zDvf2I4U1wKtH9k5qkfz0SfUfhsonzrzDSgvyM9vRRyvktvYwtTHvfmcZK_Sp8=@willsroot.io>
Date: Thu, 27 Nov 2025 02:09:58 +0000
From: William Liu <will@...lsroot.io>
To: Cong Wang <xiyou.wangcong@...il.com>
Cc: netdev@...r.kernel.org, stephen@...workplumber.org, kuba@...nel.org, Savino Dicanosa <savy@...t3mfailure.io>, Jamal Hadi Salim <jhs@...atatu.com>
Subject: Re: [Patch net v5 3/9] net_sched: Implement the right netem duplication behavior
On Wednesday, November 26th, 2025 at 11:13 PM, Cong Wang <xiyou.wangcong@...il.com> wrote:
>
>
> On Wed, Nov 26, 2025 at 10:43:07PM +0000, William Liu wrote:
>
> > > If you have a better standard than man page, please kindly point it out.
> > > I am happy to follow.
> > >
> > > I think we both agree it should not be either my standard or anyone's
> > > personal standard, this is why I use man page as a neutral and reasonable
> > > standard.
> > >
> > > If you disagree man page is reasonable, please offer a better one for me
> > > to follow. I am very open, I just simply don't know anything better than
> > > man page.
> >
> > I agree that your change does not violate manpage semantics. This was the original fix I suggested from the beginning, though other maintainers pointed out the issue that I am relaying.
> >
> > As I wrote in my previous email, "as both Jamal and Stephen have pointed out, this breaks expected user behavior as well, and the enqueuing at root was done for the sake of proper accounting and rate limit semantics."
> >
> > The previous netem fix changed user behavior that did not violate the manpage (to my knowledge). This one is the same - you are fixing one user behavior break with another. Both are cases of Hyrum's law.
>
>
> They are two different things here:
>
> 1) The behavior of "duplicate" option of netem, which is already
> documented in the man page. This is why I use man page as the standard
> to follow.
>
> 2) There are infinite combinations of TC components, obviously, it is
> impossible to document all the combinations. This is also why I don't
> think Victor's patch could fix all of them, it is a simple known
> unknown.
>
> For 1), the documented behavior is not violated by my patch, as you
> agreed.
>
> For 2), there is no known valid combination broken by this patch. At
> least not the well-known mq+netem combination.
>
> I am open to being wrong, but no one could even provide any specific case so
> far, people just keep talking with speculations, so unfortunately there is
> no action I can take with pure speculations.
>
> I hope this now makes better sense to you.
>
> > > Sorry for my ignorance. Please help me out. :)
> > >
> > > > Jamal suggested a really reasonable fix with tc_skb_ext - can we please take a look at its soundness and attempt that approach? No user behavior would be affected in that case.
> > >
> > > As I already explained, tc_skb_ext is for cross-layer, in this specific
> > > case, we don't cross layers, the skb is immediately queued to the same
> > > layer before others.
> > >
> > > Could you please kindly explain why you still believe tc_skb_ext is
> > > better? I am very open to your thoughts, please enlighten me here.
> >
> > Yes, if we re-enqueue the packet to the same netem qdisc, we don't need this, but that changes expected user behavior and may introduce additional correctness issues pointed out above.
>
>
> Again, it does not violate the man page. What standard are you referring
> to when you say "expected user behavior"? Please kindly point me to the
> standard you refer to here, I am happy to look into it.
I meant long-time existing user-observable behavior (since 2005).
>
> > If I understood Jamal correctly, tc_skb_ext allows us to maintain both the re-entrant at root behavior AND prevent DOS.
>
>
> No, the whole point of this patch is to change this problematic
> behavior, without violating man page.
>
If that's the case, I will defer to other maintainers then.
FWIW, the commit (0afb51e72855) that introduced this mentioned that the current behavior helps "avoid problems with qlen accounting with nested qdisc."
If you were just trying to fix the bug, then a fix that prevents DOS and changes no existing observable behavior is better imo.
> > I hope you can understand I am trying to relay problems other maintainers have pointed out repeatedly; I personally don't have a strong stake in this.
>
>
> Your independent thoughts are welcome, no one is absolutely right, there
> is no one you need to follow or relay.
>
> BTW, I already responded to them. Please let me know how I can be even more
> clear.
>
> Regards,
> Cong
Best,
Will