Message-ID: <gpjh4lrotyephiqpuldtxxizrsg6job7cvhiqrw72saz2ubs3h@g6fgbvexgl3r>
Date: Mon, 13 Oct 2025 23:44:32 +0200
From: Michal Kubecek <mkubecek@...e.cz>
To: Eric Dumazet <edumazet@...gle.com>
Cc: "David S . Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
Willem de Bruijn <willemb@...gle.com>, Kuniyuki Iwashima <kuniyu@...gle.com>,
David Ahern <dsahern@...nel.org>, netdev@...r.kernel.org, eric.dumazet@...il.com,
Steffen Klassert <steffen.klassert@...unet.com>, Herbert Xu <herbert@...dor.apana.org.au>
Subject: [REGRESSION] xfrm issue bisected to 6471658dc66c ("udp: use
skb_attempt_defer_free()")
On Tue, Sep 16, 2025 at 04:09:51PM GMT, Eric Dumazet wrote:
> Move skb freeing from udp recvmsg() path to the cpu
> which allocated/received it, as TCP did in linux-5.17.
>
> This increases max throughput by 20% to 30%, depending
> on number of BH producers.
>
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> ---

I encountered problems in 6.18-rc1 which were bisected to this patch,
mainline commit 6471658dc66c ("udp: use skb_attempt_defer_free()").

The way to reproduce is to start an ssh connection to a host which
matches a security policy. The first problem seen in the log is hitting
the check

	WARN_ON(x->km.state != XFRM_STATE_DEAD);

in __xfrm_state_destroy() with a stack like this:

[ 114.112830] Call Trace:
[ 114.112832] <IRQ>
[ 114.112835] __skb_ext_put+0x96/0xc0
[ 114.112840] napi_consume_skb+0x42/0x110
[ 114.112842] net_rx_action+0x14a/0x350
[ 114.112846] ? __napi_schedule+0xb6/0xc0
[ 114.112848] ? igb_msix_ring+0x6c/0x80 [igb 65a71327db3d237d6ebd4db22221016aa90703c9]
[ 114.112854] handle_softirqs+0xca/0x270
[ 114.112858] __irq_exit_rcu+0xbc/0xe0
[ 114.112860] common_interrupt+0x85/0xa0
[ 114.112863] </IRQ>

After that, the system quickly becomes unusable. The immediate crash
varies; often it is in a completely different part of the kernel (e.g.
the amdgpu driver).

Tomorrow I'll try reproducing with panic_on_warn so that I can get more
information.
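
For reference, panic_on_warn can be set at boot with the panic_on_warn
kernel parameter or at runtime through /proc/sys/kernel/panic_on_warn.
A minimal sketch of the runtime variant in C follows; the helper name
write_sysctl() is hypothetical (not a kernel or libc API), and writing
to the real sysctl file is assumed to require root.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write a string value to a sysctl-style file.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
int write_sysctl(const char *path, const char *value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;

	ok = fputs(value, f) >= 0;
	/* fclose() flushes the buffer, so a failed write shows up here. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

Calling write_sysctl("/proc/sys/kernel/panic_on_warn", "1") as root
should make the next WARN_ON() hit panic the kernel instead of only
logging it.
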
Michal
> net/ipv4/udp.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 7d1444821ee51a19cd5fd0dd5b8d096104c9283c..0c40426628eb2306b609881341a51307c4993871 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -1825,6 +1825,13 @@ void skb_consume_udp(struct sock *sk, struct sk_buff *skb, int len)
>  	if (unlikely(READ_ONCE(udp_sk(sk)->peeking_with_offset)))
>  		sk_peek_offset_bwd(sk, len);
>  
> +	if (!skb_shared(skb)) {
> +		if (unlikely(udp_skb_has_head_state(skb)))
> +			skb_release_head_state(skb);
> +		skb_attempt_defer_free(skb);
> +		return;
> +	}
> +
>  	if (!skb_unref(skb))
>  		return;
>
> --
> 2.51.0.384.g4c02a37b29-goog
>
>