lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAM0EoMmP5SBzhoKGGxfdkfvMEZ0nFCiKNJ8hBa4L-0WTCqC5Ww@mail.gmail.com>
Date: Tue, 8 Jul 2025 16:35:37 -0400
From: Jamal Hadi Salim <jhs@...atatu.com>
To: Cong Wang <xiyou.wangcong@...il.com>
Cc: William Liu <will@...lsroot.io>, netdev@...r.kernel.org, victor@...atatu.com, 
	pctammela@...atatu.com, pabeni@...hat.com, kuba@...nel.org, 
	stephen@...workplumber.org, dcaratti@...hat.com, savy@...t3mfailure.io, 
	jiri@...nulli.us, davem@...emloft.net, edumazet@...gle.com, horms@...nel.org, 
	linux-kernel@...r.kernel.org
Subject: Re: This breaks netem use cases

On Tue, Jul 8, 2025 at 3:42 PM Cong Wang <xiyou.wangcong@...il.com> wrote:
>
> (Cc LKML for more audience, since this clearly breaks potentially useful
> use cases)
>
> On Tue, Jul 08, 2025 at 04:43:26PM +0000, William Liu wrote:
> > netem_enqueue's duplication prevention logic breaks when a netem
> > resides in a qdisc tree with other netems - this can lead to a
> > soft lockup and OOM loop in netem_dequeue, as seen in [1].
> > Ensure that a duplicating netem cannot exist in a tree with other
> > netems.
>
> As I already warned in your previous patchset, this breaks the following
> potentially useful use case:
>
> sudo tc qdisc add dev eth0 root handle 1: mq
> sudo tc qdisc add dev eth0 parent 1:1 handle 10: netem duplicate 100%
> sudo tc qdisc add dev eth0 parent 1:2 handle 20: netem duplicate 100%
>
> I don't see any logical problem of such use case, therefore we should
> consider it as valid, we can't break it.
>

I thought we are trying to provide an intermediate solution to plug an
existing hole and come up with a longer term solution.
If there are users of such a "potential setup" you show above we are
going to find out very quickly.
We are working against security people who are finding all sorts of
"potential use cases" to create CVEs.

> >
> > Previous approaches suggested in discussions in chronological order:
> >
> > 1) Track duplication status or ttl in the sk_buff struct. Considered
> > too specific a use case to extend such a struct, though this would
> > be a resilient fix and address other previous and potential future
> > DOS bugs like the one described in loopy fun [2].
>
> The link you provid is from 8 years ago, since then the redirection
> logic has been improved. I am not sure why it helps to justify your
> refusal of this approach.
>
> I also strongly disagree with "too specific a use case to extend such
> a struct", we simply have so many use-case-specific fields within
> sk_buff->cb. For example, the tc_skb_cb->zone is very specific
> for act_ct.
>
> skb->cb is precisely designed to be use-case-specific and layer-specific.
>
> None of the above points stands.
>

I doubt you have looked at the code based on how you keep coming back
with the same points.

> >
> > 2) Restrict netem_enqueue recursion depth like in act_mirred with a
> > per cpu variable. However, netem_dequeue can call enqueue on its
> > child, and the depth restriction could be bypassed if the child is a
> > netem.
> >
> > 3) Use the same approach as in 2, but add metadata in netem_skb_cb
> > to handle the netem_dequeue case and track a packet's involvement
> > in duplication. This is an overly complex approach, and Jamal
> > notes that the skb cb can be overwritten to circumvent this
> > safeguard.
>
> This is not true, except qdisc_skb_cb(skb)->data, other area of
> skb->cb is preserved within Qdisc layer.
>

Your approach has issues as i pointed out. At minimum invest the time
please and look at code.
I am certain you could keep changing other code outside of netem and
fix all issues you are exposing.
We agreed this is for a short term solution and needs to be contained
on just netem, what is the point of this whole long discussion really?
We have spent over a month already..


cheers,
jamal

> Based on the above reasoning, this is clearly no way to go:
>
> NACK-by: Cong Wang <xiyou.wangcong@...il.com>
>
> Sorry for standing firmly for the users, we simply don't break use
> cases. This is nothing personal, just a firm principle.
>
> Please let me know if there is anything else I can help you with. I am
> always ready to help (but not in a way of breaking use cases).
>
> Thanks for your understanding!

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ