Message-ID: <CANn89iKN8Efr7VpW5g8Qu_3jZm+6LcvG+9EjZ286hWmF2FRwcQ@mail.gmail.com>
Date: Mon, 13 Oct 2025 15:12:59 -0700
From: Eric Dumazet <edumazet@...gle.com>
To: Michal Kubecek <mkubecek@...e.cz>
Cc: "David S . Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>, 
	Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>, 
	Willem de Bruijn <willemb@...gle.com>, Kuniyuki Iwashima <kuniyu@...gle.com>, David Ahern <dsahern@...nel.org>, 
	netdev@...r.kernel.org, eric.dumazet@...il.com, 
	Steffen Klassert <steffen.klassert@...unet.com>, Herbert Xu <herbert@...dor.apana.org.au>
Subject: Re: [REGRESSION] xfrm issue bisected to 6471658dc66c ("udp: use skb_attempt_defer_free()")

On Mon, Oct 13, 2025 at 2:44 PM Michal Kubecek <mkubecek@...e.cz> wrote:
>
> On Tue, Sep 16, 2025 at 04:09:51PM GMT, Eric Dumazet wrote:
> > Move skb freeing from udp recvmsg() path to the cpu
> > which allocated/received it, as TCP did in linux-5.17.
> >
> > This increases max throughput by 20% to 30%, depending
> > on number of BH producers.
> >
> > Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> > ---
>
> I encountered problems in 6.18-rc1 which were bisected to this patch,
> mainline commit 6471658dc66c ("udp: use skb_attempt_defer_free()").
>
> The way to reproduce is starting an ssh connection to a host which
> matches a security policy. The first problem seen in the log is hitting
> the check

Oops, thanks for the report. A secpath_reset() is missing, I guess.

It is hard to believe we store skbs with expensive XFRM state in a
protocol receive queue.

This must have been a pretty high cost, even before my patch.

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 95241093b7f01b2dc31d9520b693f46400e545ff..dda944184dc2ae260de72a76c67038c20b0bae1b 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1744,6 +1744,7 @@ int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb)

        atomic_add(size, &udp_prod_queue->rmem_alloc);

+       secpath_reset(skb);
        if (!llist_add(&skb->ll_node, &udp_prod_queue->ll_root))
                return 0;
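
For illustration only: a small userspace model of the idea behind the
change above, not kernel code. It assumes a buffer can carry one
refcounted extension, and that a free may run later in a different
context. All names (toy_buf, toy_ext, toy_ext_reset, toy_deferred_free,
toy_run) are hypothetical stand-ins; toy_ext_reset() only mimics the
intent of secpath_reset(), which is to detach the extension before the
buffer is queued. Whether this addresses the reported warning is a guess
at this point in the thread.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an skb extension such as a sec_path. */
struct toy_ext {
	int refcnt;
};

/* Hypothetical stand-in for an sk_buff. */
struct toy_buf {
	struct toy_ext *ext;	/* NULL when no extension is attached */
};

/* Number of extension releases performed by the deferred free path. */
static int late_ext_puts;

static void toy_ext_put(struct toy_ext *ext)
{
	if (--ext->refcnt == 0)
		free(ext);
}

/* Mimics the intent of secpath_reset(): detach and release right now. */
static void toy_ext_reset(struct toy_buf *buf)
{
	if (buf->ext) {
		toy_ext_put(buf->ext);
		buf->ext = NULL;
	}
}

/* Models a free that runs later, possibly in another context. */
static void toy_deferred_free(struct toy_buf *buf)
{
	if (buf->ext) {
		late_ext_puts++;
		toy_ext_put(buf->ext);
	}
	free(buf);
}

/* Allocate a buffer with one extension attached (refcnt == 1). */
static struct toy_buf *toy_buf_alloc(void)
{
	struct toy_buf *buf = malloc(sizeof(*buf));
	struct toy_ext *ext = malloc(sizeof(*ext));

	if (!buf || !ext)
		abort();
	ext->refcnt = 1;
	buf->ext = ext;
	return buf;
}

/*
 * Queue-then-free one buffer. Returns how many extension releases
 * were left for the deferred free to perform.
 */
int toy_run(int reset_at_enqueue)
{
	struct toy_buf *buf = toy_buf_alloc();

	late_ext_puts = 0;
	if (reset_at_enqueue)
		toy_ext_reset(buf);	/* the "enqueue" step */
	toy_deferred_free(buf);		/* the later free */
	return late_ext_puts;
}
```

In this model toy_run(0) leaves one extension release to the deferred
free, while toy_run(1) leaves none, since the extension is already gone
when the buffer is queued.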




>
>         WARN_ON(x->km.state != XFRM_STATE_DEAD);
>
> in __xfrm_state_destroy() with a stack like this:
>
> [  114.112830] Call Trace:
> [  114.112832]  <IRQ>
> [  114.112835]  __skb_ext_put+0x96/0xc0
> [  114.112840]  napi_consume_skb+0x42/0x110
> [  114.112842]  net_rx_action+0x14a/0x350
> [  114.112846]  ? __napi_schedule+0xb6/0xc0
> [  114.112848]  ? igb_msix_ring+0x6c/0x80 [igb 65a71327db3d237d6ebd4db22221016aa90703c9]
> [  114.112854]  handle_softirqs+0xca/0x270
> [  114.112858]  __irq_exit_rcu+0xbc/0xe0
> [  114.112860]  common_interrupt+0x85/0xa0
> [  114.112863]  </IRQ>
>
> After that, the system quickly becomes unusable. The immediate crash
> varies and is often in a completely different part of the kernel (e.g.
> the amdgpu driver).
>
> Tomorrow I'll try reproducing with panic_on_warn so that I can get more
> information.
>
> Michal
>
> >  net/ipv4/udp.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > index 7d1444821ee51a19cd5fd0dd5b8d096104c9283c..0c40426628eb2306b609881341a51307c4993871 100644
> > --- a/net/ipv4/udp.c
> > +++ b/net/ipv4/udp.c
> > @@ -1825,6 +1825,13 @@ void skb_consume_udp(struct sock *sk, struct sk_buff *skb, int len)
> >       if (unlikely(READ_ONCE(udp_sk(sk)->peeking_with_offset)))
> >               sk_peek_offset_bwd(sk, len);
> >
> > +     if (!skb_shared(skb)) {
> > +             if (unlikely(udp_skb_has_head_state(skb)))
> > +                     skb_release_head_state(skb);
> > +             skb_attempt_defer_free(skb);
> > +             return;
> > +     }
> > +
> >       if (!skb_unref(skb))
> >               return;
> >
> > --
> > 2.51.0.384.g4c02a37b29-goog
> >
> >