Message-ID: <d9bf21296a4691ac5aca11ccd832765b262f7088.camel@redhat.com>
Date: Wed, 05 Jul 2023 12:28:40 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: Ian Kumlien <ian.kumlien@...il.com>
Cc: Alexander Lobakin <aleksander.lobakin@...el.com>, intel-wired-lan
 <intel-wired-lan@...ts.osuosl.org>, Jakub Kicinski <kuba@...nel.org>, Eric
 Dumazet <edumazet@...gle.com>, "netdev@...r.kernel.org"
 <netdev@...r.kernel.org>,  "linux-kernel@...r.kernel.org"
 <linux-kernel@...r.kernel.org>
Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Tue, 2023-07-04 at 16:27 +0200, Ian Kumlien wrote:
> More stacktraces.. =)
> 
> cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> [  411.413767] ------------[ cut here ]------------
> [  411.413792] WARNING: CPU: 9 PID: 942 at include/net/udp.h:509
> udpv6_queue_rcv_skb (./include/net/udp.h:509 net/ipv6/udp.c:800
> net/ipv6/udp.c:787)

I'm really running out of ideas here...

This is:

	WARN_ON_ONCE(UDP_SKB_CB(skb)->partial_cov);

which sort of hints at the skb being shared (skb->users > 1) while
enqueued in multiple places (bridge local input and br forward/flood to
tun device). I audited the bridge mc flooding code, and I could not find
how a shared skb could land in the local input path.

Anyway the other splats reported here and in later emails are
compatible with shared skbs.

The above leads to another bunch of questions:
* can you reproduce the issue after disabling 'rx-gro-list' on the
ingress device? (while keeping 'rx-udp-gro-forwarding' on).
* do you have by chance qdiscs on top of the VM tun devices?

The last patch I shared was buggy, as it attempts to unclone the skb
after already touching skb_shared_info.
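
To illustrate the ordering issue, here is a minimal userspace sketch of the
two kinds of sharing involved. It is a toy model, not kernel code: every
toy_* name is hypothetical, the structs only mimic skb->users and the data
reference count, and the -1 error return stands in for a real errno. The
point is that a holder needs a private header first (what skb_share_check()
provides) and a private data area second (what skb_unclone() provides)
before writing to anything that lives in the shared info.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: all toy_* names are hypothetical, not kernel APIs. */
struct toy_data {              /* stands in for the data area + skb_shared_info */
	int dataref;           /* number of headers pointing at this data */
	void *frag_list;       /* stands in for skb_shinfo(skb)->frag_list */
};

struct toy_skb {               /* stands in for struct sk_buff */
	int users;             /* number of holders of this header */
	struct toy_data *data;
};

static int toy_shared(const struct toy_skb *skb) { return skb->users > 1; }
static int toy_cloned(const struct toy_skb *skb) { return skb->data->dataref > 1; }

/* Mimics skb_share_check(): drop our reference to a shared header and
 * return a private header that still points at the same data. */
static struct toy_skb *toy_share_check(struct toy_skb *skb)
{
	struct toy_skb *n;

	if (!toy_shared(skb))
		return skb;
	skb->users--;                  /* our reference goes away either way */
	n = malloc(sizeof(*n));
	if (!n)
		return NULL;
	n->users = 1;
	n->data = skb->data;
	n->data->dataref++;            /* the new header is now a clone */
	return n;
}

/* Mimics skb_unclone(): give a cloned header its own copy of the data. */
static int toy_unclone(struct toy_skb *skb)
{
	struct toy_data *d;

	if (!toy_cloned(skb))
		return 0;
	d = malloc(sizeof(*d));
	if (!d)
		return -1;             /* placeholder for an allocation error */
	*d = *skb->data;
	d->dataref = 1;
	skb->data->dataref--;
	skb->data = d;
	return 0;
}

/* Returns 1 if the other holder's frag_list survives our write. */
static int toy_demo(void)
{
	static int marker;
	struct toy_data *d = malloc(sizeof(*d));
	struct toy_skb *orig = malloc(sizeof(*orig));
	struct toy_skb *mine;
	int ok;

	if (!d || !orig) {
		free(d);
		free(orig);
		return 0;
	}
	d->dataref = 1;
	d->frag_list = &marker;
	orig->users = 2;               /* two holders of the same header */
	orig->data = d;

	mine = toy_share_check(orig);  /* step 1: private header */
	if (!mine || toy_unclone(mine)) {  /* step 2: private data */
		free(mine);
		free(d);
		free(orig);
		return 0;
	}
	mine->data->frag_list = NULL;  /* step 3: only now modify shared info */

	ok = orig->data->frag_list == &marker && orig->users == 1 &&
	     mine != orig && !toy_cloned(orig) && !toy_shared(mine);
	free(mine->data);
	free(mine);
	free(d);
	free(orig);
	return ok;
}
```

Calling toy_demo() walks through the three steps in order; swapping step 3
ahead of step 2 in this model would clear the frag_list the other holder
still sees.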

Could you please replace that patch with the following one?

Thanks!

Paolo
---
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 6c5915efbc17..0b0f4309506d 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4261,6 +4261,17 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
 
 	skb_push(skb, -skb_network_offset(skb) + offset);
 
+	if (WARN_ON_ONCE(skb_shared(skb))) {
+		skb = skb_share_check(skb, GFP_ATOMIC);
+		if (!skb)
+			goto err_linearize;
+	}
+
+	/* later code will clear the gso area in the shared info */
+	err = skb_unclone(skb, GFP_ATOMIC);
+	if (err)
+		goto err_linearize;
+
 	skb_shinfo(skb)->frag_list = NULL;
 
 	while (list_skb) {

