netdev - Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <d9bf21296a4691ac5aca11ccd832765b262f7088.camel@redhat.com>
Date: Wed, 05 Jul 2023 12:28:40 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: Ian Kumlien <ian.kumlien@...il.com>
Cc: Alexander Lobakin <aleksander.lobakin@...el.com>, intel-wired-lan
 <intel-wired-lan@...ts.osuosl.org>, Jakub Kicinski <kuba@...nel.org>, Eric
 Dumazet <edumazet@...gle.com>, "netdev@...r.kernel.org"
 <netdev@...r.kernel.org>,  "linux-kernel@...r.kernel.org"
 <linux-kernel@...r.kernel.org>
Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Tue, 2023-07-04 at 16:27 +0200, Ian Kumlien wrote:
> More stacktraces.. =)
> 
> cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> [  411.413767] ------------[ cut here ]------------
> [  411.413792] WARNING: CPU: 9 PID: 942 at include/net/ud	p.h:509
> udpv6_queue_rcv_skb (./include/net/udp.h:509 net/ipv6/udp.c:800
> net/ipv6/udp.c:787)

I'm really running out of ideas here...

This is:

	WARN_ON_ONCE(UDP_SKB_CB(skb)->partial_cov);

sort of hint skb being shared (skb->users > 1) while enqueued in
multiple places (bridge local input and br forward/flood to tun
device). I audited the bridge mc flooding code, and I could not find
how a shared skb could land into the local input path.

Anyway the other splats reported here and in later emails are
compatible with shared skbs.

The above leads to another bunch of questions:
* can you reproduce the issue after disabling 'rx-gro-list' on the
ingress device? (while keeping 'rx-udp-gro-forwarding' on).
* do you have by chance qdiscs on top of the VM tun devices?

The last patch I shared was buggy, as it attempts to unclone the skb
after already touching skb_shared_info.

Could you please replace such patch with the following? 

Thanks!

Paolo
---
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 6c5915efbc17..0b0f4309506d 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4261,6 +4261,17 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,

 	skb_push(skb, -skb_network_offset(skb) + offset);

+	if (WARN_ON_ONCE(skb_shared(skb))) {
+		skb = skb_share_check(skb, GFP_ATOMIC);
+		if (!skb)
+			goto err_linearize;
+	}
+
+	/* later code will clear the gso area in the shared info */
+	err = skb_unclone(skb, GFP_ATOMIC);
+	if (err)
+		goto err_linearize;
+
 	skb_shinfo(skb)->frag_list = NULL;

 	while (list_skb) {