lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87ed73z3oe.fsf@cloudflare.com>
Date: Mon, 05 Aug 2024 12:04:17 +0200
From: Jakub Sitnicki <jakub@...udflare.com>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: netdev@...r.kernel.org,  "David S. Miller" <davem@...emloft.net>,  Eric
 Dumazet <edumazet@...gle.com>,  Jakub Kicinski <kuba@...nel.org>,  Paolo
 Abeni <pabeni@...hat.com>,  Willem de Bruijn <willemb@...gle.com>,
  kernel-team@...udflare.com,
  syzbot+e15b7e15b8a751a91d9a@...kaller.appspotmail.com
Subject: Re: [PATCH net v2 1/2] gso: Skip bad offload detection when device
 supports requested GSO

On Thu, Aug 01, 2024 at 03:13 PM -04, Willem de Bruijn wrote:
> On Thu, Aug 1, 2024 at 10:09 AM Jakub Sitnicki <jakub@...udflare.com> wrote:
>>
>> In commit 10154dbded6d ("udp: Allow GSO transmit from devices with no
>> checksum offload") we have intentionally allowed UDP GSO packets marked
>> CHECKSUM_NONE to pass to the GSO stack, so that they can be segmented and
>> checksummed by a software fallback when the egress device lacks these
>> features.
>>
>> What was not taken into consideration is that a CHECKSUM_NONE skb can be
>> handed over to the GSO stack also when the egress device advertises the
>> tx-udp-segmentation / NETIF_F_GSO_UDP_L4 feature.
>>
>> This can happen in two situations, which we detect in __ip_append_data()
>> and __ip6_append_data():
>>
>> 1) when there are IPv6 extension headers present, or
>> 2) when the tunnel device does not advertise checksum offload.
>>
>> Note that in the latter case we have a nonsensical device configuration.
>> Device support for UDP segmentation offload requires checksum offload in
>> hardware as well.
>>
>> Syzbot has discovered the first case, producing a warning as below:
>>
>>   ip6tnl0: caps=(0x00000006401d7869, 0x00000006401d7869)
>>   WARNING: CPU: 0 PID: 5112 at net/core/dev.c:3293 skb_warn_bad_offload+0x166/0x1a0 net/core/dev.c:3291
>>   Modules linked in:
>>   CPU: 0 PID: 5112 Comm: syz-executor391 Not tainted 6.10.0-rc7-syzkaller-01603-g80ab5445da62 #0
>>   Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
>>   RIP: 0010:skb_warn_bad_offload+0x166/0x1a0 net/core/dev.c:3291
>>   [...]
>>   Call Trace:
>>    <TASK>
>>    __skb_gso_segment+0x3be/0x4c0 net/core/gso.c:127
>>    skb_gso_segment include/net/gso.h:83 [inline]
>>    validate_xmit_skb+0x585/0x1120 net/core/dev.c:3661
>>    __dev_queue_xmit+0x17a4/0x3e90 net/core/dev.c:4415
>>    neigh_output include/net/neighbour.h:542 [inline]
>>    ip6_finish_output2+0xffa/0x1680 net/ipv6/ip6_output.c:137
>>    ip6_finish_output+0x41e/0x810 net/ipv6/ip6_output.c:222
>>    ip6_send_skb+0x112/0x230 net/ipv6/ip6_output.c:1958
>>    udp_v6_send_skb+0xbf5/0x1870 net/ipv6/udp.c:1292
>>    udpv6_sendmsg+0x23b3/0x3270 net/ipv6/udp.c:1588
>>    sock_sendmsg_nosec net/socket.c:730 [inline]
>>    __sock_sendmsg+0xef/0x270 net/socket.c:745
>>    ____sys_sendmsg+0x525/0x7d0 net/socket.c:2585
>>    ___sys_sendmsg net/socket.c:2639 [inline]
>>    __sys_sendmmsg+0x3b2/0x740 net/socket.c:2725
>>    __do_sys_sendmmsg net/socket.c:2754 [inline]
>>    __se_sys_sendmmsg net/socket.c:2751 [inline]
>>    __x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2751
>>    do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>>    do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
>>    entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>    [...]
>>    </TASK>
>>
>> We are hitting the bad offload warning because when an egress device is
>> capable of handling segmentation offload requested by
>> skb_shinfo(skb)->gso_type, the chain of gso_segment callbacks won't produce
>> any segment skbs and return NULL. See the skb_gso_ok() branch in
>> {__udp,tcp,sctp}_gso_segment helpers.
>>
>> To fix it, skip bad offload detection when gso_segment has returned
>> nothing. We know that in such case the egress device supports the desired
>> GSO offload, which implies that it can fill in L4 checksums. Hence we don't
>> need to check the skb->ip_summed value, which reflects the egress device
>> checksum capabilities.
>>
>> Fixes: 10154dbded6d ("udp: Allow GSO transmit from devices with no checksum offload")
>> Reported-by: syzbot+e15b7e15b8a751a91d9a@...kaller.appspotmail.com
>> Closes: https://lore.kernel.org/all/000000000000e1609a061d5330ce@google.com/
>> Signed-off-by: Jakub Sitnicki <jakub@...udflare.com>
>
> Reviewed-by: Willem de Bruijn <willemb@...gle.com>
>
> It's a bit odd, in that the ip_summed == CHECKSUM_NONE ends up just
> being ignored and devices are trusted to always be able to checksum
> offload when they can segment offload -- even when the device does not
> advertise checksum offload.
>
> I think we should have a follow-on that makes advertising
> NETIF_F_GSO_UDP_L4 dependent on having at least one of the
> NETIF_F_*_CSUM bits set (handwaving over what happens when only
> advertising NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM).

I agree. I've also gained some clarity as to how the fix should
look. Let's circle back to it, if we still think it's relevant once we
hash out the fix.

After spending some quality time debugging the addded regression test
[1], I've realized this fix is wrong.

You see, with commit 10154dbded6d ("udp: Allow GSO transmit from devices
with no checksum offload"), I've opened up the UDP_SEGMENT API to two
uses, which I think should not be allowed:

1. Hardware USO for IPv6 dgrams with extension headers

Previously that led to -EIO, because __ip6_append_data won't annotate
such packets as CHECKSUM_PARTIAL.

I'm guessing that we do this because some drivers that advertise csum
offload can't actually handle checksumming when extension headers are
present.

Extension headers are not part of IPv6 pseudo header, but who knows what
limitations NIC firmwares have.

Either way, changing it just like that sounds risky, so I think we need
to fall back to software USO with software checksum in this case.

Alternatively, we could catch it in the udp layer and error out with EIO
as ealier. But that shifts some burden onto the user space (detect and
segment before sendmsg()).

2. Hardware USO when hardware csum is unsupported / disabled

That sounds like a pathological device configuration case, but since it
is possible today with some drivers to disable csum offload for one IP
version and not the other, it seems safest to just handle that
gracefully.

I don't know why one might want to do that. Perhaps as a workaround for
some firmware bug while waiting for a fix?

In this scenario I think we also need to fall back to software USO and
checksum.

Code-wise that could look like below. WDYT?

[1] I feel silly for submitting a broken patch. I've been running a
stale test prog in a VM. It seems that virtme-ng with a read+write
overlay for rootfs played a trick on me. Changes to the host files are
not (always?) visible to the guest.

---8<---
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 83f8cd8aa2d1..819173807c81 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4806,7 +4806,13 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
 			goto perform_csum_check;
 
 		if (!sg) {
-			if (!csum) {
+			/* Fall back to software checksum for segments if:
+			 * 1) device can't checksum this network protocol, OR
+			 * 2) we consider the packet to be not checksummable in
+			 *    hardware, for example IPv6 extension headers are
+			 *    present.
+			 */
+			if (!csum || head_skb->ip_summed != CHECKSUM_PARTIAL) {
 				if (!nskb->remcsum_offload)
 					nskb->ip_summed = CHECKSUM_NONE;
 				SKB_GSO_CB(nskb)->csum =
@@ -4896,7 +4902,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
 		nskb->truesize += nskb->data_len;
 
 perform_csum_check:
-		if (!csum) {
+		if (!csum || head_skb->ip_summed != CHECKSUM_PARTIAL) {
 			if (skb_has_shared_frag(nskb) &&
 			    __skb_linearize(nskb))
 				goto err;
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index bc8a9da750fe..0a34b418b83c 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -282,7 +282,15 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
 		     skb_transport_header(gso_skb)))
 		return ERR_PTR(-EINVAL);
 
-	if (skb_gso_ok(gso_skb, features | NETIF_F_GSO_ROBUST)) {
+	/* Pass-through to device for segmentation only if:
+	 * 1) we consider the packet checksummable in hardware, that is no
+	 *    IPv6 extension headers present, AND
+	 * 2) device supports checksum offload for this network protocol
+	 *    (NETIF_F_{IP,IPV6}_CSUM or NETIF_F_HW_CSUM), AND
+	 * 3) device supports the requested GSO kind.
+	 */
+	if (gso_skb->ip_summed == CHECKSUM_PARTIAL &&
+	    skb_gso_ok(gso_skb, features | NETIF_F_GSO_ROBUST)) {
 		/* Packet is from an untrusted source, reset gso_segs. */
 		skb_shinfo(gso_skb)->gso_segs = DIV_ROUND_UP(gso_skb->len - sizeof(*uh),
 							     mss);

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ