linux-kernel - Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?


Message-ID: <92a4d42491a2c219192ae86fa04b579ea3676d8c.camel@redhat.com>
Date:   Tue, 04 Jul 2023 12:10:10 +0200
From:   Paolo Abeni <pabeni@...hat.com>
To:     Ian Kumlien <ian.kumlien@...il.com>
Cc:     Alexander Lobakin <aleksander.lobakin@...el.com>,
        intel-wired-lan <intel-wired-lan@...ts.osuosl.org>,
        Jakub Kicinski <kuba@...nel.org>,
        Eric Dumazet <edumazet@...gle.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Mon, 2023-07-03 at 11:37 +0200, Ian Kumlien wrote:
> So, got back, switched to 6.4.1 and reran with kmemleak and kasan
> 
> I got the splat from:
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index cea28d30abb5..701c1b5cf532 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -4328,6 +4328,9 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
> 
>         skb->prev = tail;
> 
> +       if (WARN_ON_ONCE(!skb->next))
> +               goto err_linearize;
> +
>         if (skb_needs_linearize(skb, features) &&
>             __skb_linearize(skb))
>                 goto err_linearize;
> 
> I'm just happy i ran with dmesg -W since there was only minimal output
> on the console:
> [39914.833696] rcu: INFO: rcu_preempt self-detected stall on CPU
> [39914.839598] rcu:     2-....: (20997 ticks this GP)
> idle=dd64/1/0x4000000000000000 softirq=4633489/4633489 fqs=4687
> [39914.849839] rcu:     (t=21017 jiffies g=18175157 q=45473 ncpus=12)
> [39977.862108] rcu: INFO: rcu_preempt self-detected stall on CPU
> [39977.868002] rcu:     2-....: (84001 ticks this GP)
> idle=dd64/1/0x4000000000000000 softirq=4633489/4633489 fqs=28434
> [39977.878340] rcu:     (t=84047 jiffies g=18175157 q=263477 ncpus=12)
> [40040.892521] rcu: INFO: rcu_preempt self-detected stall on CPU
> [40040.898414] rcu:     2-....: (147006 ticks this GP)
> idle=dd64/1/0x4000000000000000 softirq=4633489/4633489 fqs=53043
> [40040.908831] rcu:     (t=147079 jiffies g=18175157 q=464422 ncpus=12)
> [40065.080842] ixgbe 0000:06:00.1 eno2: Reset adapter

Ouch, just another slightly different issue, apparently :(

I'll try some wild guesses. The rcu stall could cause the OOM observed
in the previous tests. Here the OOM did not trigger because, due to
kasan/kmemleak, the kernel is able to process fewer packets in the
same period of time.

[...]
> [39914.857231] skb_segment (net/core/skbuff.c:4519)

I *think* this could be looping "forever", if gso_size becomes 0, which
is in turn completely unexpected ...
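
To illustrate the suspicion above, here is a minimal userspace sketch,
not the kernel's skb_segment(). The function count_segments() and its
max_iters guard are hypothetical names invented for this example. It
only shows why a per-segment size of 0 stops a length-driven loop from
making progress:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace model, not kernel code: count how many
 * segments a payload of total_len bytes is cut into when each segment
 * carries at most gso_size bytes. max_iters is an illustrative guard
 * added by this model so that the zero-size case terminates.
 * Returns the segment count, or -1 if the guard trips.
 */
static long count_segments(size_t total_len, size_t gso_size, long max_iters)
{
	size_t offset = 0;
	long segs = 0;

	assert(max_iters > 0);

	while (offset < total_len) {
		size_t len = total_len - offset;

		if (len > gso_size)
			len = gso_size;	/* len is 0 when gso_size is 0 */

		offset += len;		/* no progress when len is 0 */
		if (++segs >= max_iters)
			return -1;	/* would spin forever without the guard */
	}
	return segs;
}
```

In this model, a 3000-byte payload with gso_size 1400 yields 3
segments, while gso_size 0 never advances the offset and only stops
because of the artificial guard.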


> [39914.857257] ? write_profile (kernel/stacktrace.c:83)
> [39914.857296] ? pskb_extract (net/core/skbuff.c:4360)
> [39914.857320] ? rt6_score_route (net/ipv6/route.c:713 (discriminator 1))
> [39914.857346] ? llist_add_batch (lib/llist.c:33 (discriminator 14))
> [39914.857379] __udp_gso_segment (net/ipv4/udp_offload.c:290)
> [39914.857413] ? ip6_dst_destroy (net/ipv6/route.c:788)
> [39914.857442] udp6_ufo_fragment (net/ipv6/udp_offload.c:47)
> [39914.857472] ? udp6_gro_complete (net/ipv6/udp_offload.c:20)
> [39914.857498] ? ipv6_gso_pull_exthdrs (net/ipv6/ip6_offload.c:53)
> [39914.857528] ipv6_gso_segment (net/ipv6/ip6_offload.c:119
> net/ipv6/ip6_offload.c:74)
> [39914.857557] ? ipv6_gso_pull_exthdrs (net/ipv6/ip6_offload.c:76)
> [39914.857583] ? nft_update_chain_stats (net/netfilter/nf_tables_core.c:254)
> [39914.857612] ? fib6_select_path (net/ipv6/route.c:458)
> [39914.857643] skb_mac_gso_segment (net/core/gro.c:141)
> [39914.857673] ? skb_eth_gso_segment (net/core/gro.c:127)
> [39914.857702] ? ipv6_skip_exthdr (net/ipv6/exthdrs_core.c:190)
> [39914.857726] ? kasan_save_stack (mm/kasan/common.c:47)
> [39914.857758] __skb_gso_segment (net/core/dev.c:3401 (discriminator 2))
> [39914.857787] udpv6_queue_rcv_skb (./include/net/udp.h:492
> net/ipv6/udp.c:796 net/ipv6/udp.c:787)
> [39914.857816] __udp6_lib_rcv (net/ipv6/udp.c:906 net/ipv6/udp.c:1013)

... but this means we are processing a multicast packet, so the skb is
likely cloned. If one of the clone instances simultaneously enters
skb_segment_list(), the latter would unconditionally call:

	skb_gso_reset(skb);

clearing the gso area in the shared info and causing unexpected results
(possibly the memory corruption observed before, and the above RCU
stall) for the other clone instances.
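
The sharing problem described above can be sketched in userspace as
follows. This is only a rough model under stated assumptions: the
model_* types and helpers are hypothetical stand-ins invented for this
example, not the kernel's sk_buff API, and the single-threaded code
ignores the concurrency aspect entirely. It shows that a reset through
one clone is visible to the other unless that clone first takes a
private copy of the shared area:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the shared info area of a buffer. */
struct model_shinfo {
	int refcnt;		/* number of buffers pointing here */
	unsigned int gso_size;
};

/* Hypothetical stand-in for a buffer head. */
struct model_skb {
	struct model_shinfo *shinfo;
};

static int model_init(struct model_skb *skb, unsigned int gso_size)
{
	skb->shinfo = malloc(sizeof(*skb->shinfo));
	if (!skb->shinfo)
		return -1;
	skb->shinfo->refcnt = 1;
	skb->shinfo->gso_size = gso_size;
	return 0;
}

/* A clone shares the same shared info area as the original. */
static void model_clone(struct model_skb *dst, const struct model_skb *src)
{
	dst->shinfo = src->shinfo;
	dst->shinfo->refcnt++;
}

/* Give skb a private copy if the area is shared; 0 on success. */
static int model_unclone(struct model_skb *skb)
{
	struct model_shinfo *copy;

	if (skb->shinfo->refcnt == 1)
		return 0;

	copy = malloc(sizeof(*copy));
	if (!copy)
		return -1;

	*copy = *skb->shinfo;
	copy->refcnt = 1;
	skb->shinfo->refcnt--;
	skb->shinfo = copy;
	return 0;
}

/* Clears the gso field in whatever area skb currently points to. */
static void model_gso_reset(struct model_skb *skb)
{
	skb->shinfo->gso_size = 0;
}

static void model_free(struct model_skb *skb)
{
	assert(skb->shinfo->refcnt > 0);
	if (--skb->shinfo->refcnt == 0)
		free(skb->shinfo);
	skb->shinfo = NULL;
}
```

With two buffers a and b where b is a clone of a, calling
model_gso_reset(&b) directly also zeroes the gso_size seen through a.
Calling model_unclone(&b) first leaves a's value untouched.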

Assuming there are no other issues and that the above is not just a
side effect of ENOCOFFEE here, the following should possibly solve the
issue. Could you please add it to your testbed? (Still with kasan and
the previous patch; kmemleak is possibly not needed.)

Thanks!
---
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 6c5915efbc17..ac1ca6c7bff9 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4263,6 +4263,11 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
 
 	skb_shinfo(skb)->frag_list = NULL;
 
+	/* later code will clear the gso area in the shared info */
+	err = skb_header_unclone(skb, GFP_ATOMIC);
+	if (err)
+		goto err_linearize;
+
 	while (list_skb) {
 		nskb = list_skb;
 		list_skb = list_skb->next;