lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <66b0e0d3c2119_2f5edf294c1@willemb.c.googlers.com.notmuch>
Date: Mon, 05 Aug 2024 10:25:23 -0400
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Jakub Sitnicki <jakub@...udflare.com>, 
 Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: netdev@...r.kernel.org, 
 "David S. Miller" <davem@...emloft.net>, 
 Eric Dumazet <edumazet@...gle.com>, 
 Jakub Kicinski <kuba@...nel.org>, 
 Paolo Abeni <pabeni@...hat.com>, 
 Willem de Bruijn <willemb@...gle.com>, 
 kernel-team@...udflare.com, 
 syzbot+e15b7e15b8a751a91d9a@...kaller.appspotmail.com
Subject: Re: [PATCH net v2 1/2] gso: Skip bad offload detection when device
 supports requested GSO

Jakub Sitnicki wrote:
> On Thu, Aug 01, 2024 at 03:13 PM -04, Willem de Bruijn wrote:
> > On Thu, Aug 1, 2024 at 10:09 AM Jakub Sitnicki <jakub@...udflare.com> wrote:
> >>
> >> In commit 10154dbded6d ("udp: Allow GSO transmit from devices with no
> >> checksum offload") we have intentionally allowed UDP GSO packets marked
> >> CHECKSUM_NONE to pass to the GSO stack, so that they can be segmented and
> >> checksummed by a software fallback when the egress device lacks these
> >> features.
> >>
> >> What was not taken into consideration is that a CHECKSUM_NONE skb can be
> >> handed over to the GSO stack also when the egress device advertises the
> >> tx-udp-segmentation / NETIF_F_GSO_UDP_L4 feature.
> >>
> >> This can happen in two situations, which we detect in __ip_append_data()
> >> and __ip6_append_data():
> >>
> >> 1) when there are IPv6 extension headers present, or
> >> 2) when the tunnel device does not advertise checksum offload.
> >>
> >> Note that in the latter case we have a nonsensical device configuration.
> >> Device support for UDP segmentation offload requires checksum offload in
> >> hardware as well.
> >>
> >> Syzbot has discovered the first case, producing a warning as below:
> >>
> >>   ip6tnl0: caps=(0x00000006401d7869, 0x00000006401d7869)
> >>   WARNING: CPU: 0 PID: 5112 at net/core/dev.c:3293 skb_warn_bad_offload+0x166/0x1a0 net/core/dev.c:3291
> >>   Modules linked in:
> >>   CPU: 0 PID: 5112 Comm: syz-executor391 Not tainted 6.10.0-rc7-syzkaller-01603-g80ab5445da62 #0
> >>   Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
> >>   RIP: 0010:skb_warn_bad_offload+0x166/0x1a0 net/core/dev.c:3291
> >>   [...]
> >>   Call Trace:
> >>    <TASK>
> >>    __skb_gso_segment+0x3be/0x4c0 net/core/gso.c:127
> >>    skb_gso_segment include/net/gso.h:83 [inline]
> >>    validate_xmit_skb+0x585/0x1120 net/core/dev.c:3661
> >>    __dev_queue_xmit+0x17a4/0x3e90 net/core/dev.c:4415
> >>    neigh_output include/net/neighbour.h:542 [inline]
> >>    ip6_finish_output2+0xffa/0x1680 net/ipv6/ip6_output.c:137
> >>    ip6_finish_output+0x41e/0x810 net/ipv6/ip6_output.c:222
> >>    ip6_send_skb+0x112/0x230 net/ipv6/ip6_output.c:1958
> >>    udp_v6_send_skb+0xbf5/0x1870 net/ipv6/udp.c:1292
> >>    udpv6_sendmsg+0x23b3/0x3270 net/ipv6/udp.c:1588
> >>    sock_sendmsg_nosec net/socket.c:730 [inline]
> >>    __sock_sendmsg+0xef/0x270 net/socket.c:745
> >>    ____sys_sendmsg+0x525/0x7d0 net/socket.c:2585
> >>    ___sys_sendmsg net/socket.c:2639 [inline]
> >>    __sys_sendmmsg+0x3b2/0x740 net/socket.c:2725
> >>    __do_sys_sendmmsg net/socket.c:2754 [inline]
> >>    __se_sys_sendmmsg net/socket.c:2751 [inline]
> >>    __x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2751
> >>    do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> >>    do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
> >>    entry_SYSCALL_64_after_hwframe+0x77/0x7f
> >>    [...]
> >>    </TASK>
> >>
> >> We are hitting the bad offload warning because when an egress device is
> >> capable of handling segmentation offload requested by
> >> skb_shinfo(skb)->gso_type, the chain of gso_segment callbacks won't produce
> >> any segment skbs and return NULL. See the skb_gso_ok() branch in
> >> {__udp,tcp,sctp}_gso_segment helpers.
> >>
> >> To fix it, skip bad offload detection when gso_segment has returned
> >> nothing. We know that in such case the egress device supports the desired
> >> GSO offload, which implies that it can fill in L4 checksums. Hence we don't
> >> need to check the skb->ip_summed value, which reflects the egress device
> >> checksum capabilities.
> >>
> >> Fixes: 10154dbded6d ("udp: Allow GSO transmit from devices with no checksum offload")
> >> Reported-by: syzbot+e15b7e15b8a751a91d9a@...kaller.appspotmail.com
> >> Closes: https://lore.kernel.org/all/000000000000e1609a061d5330ce@google.com/
> >> Signed-off-by: Jakub Sitnicki <jakub@...udflare.com>
> >
> > Reviewed-by: Willem de Bruijn <willemb@...gle.com>
> >
> > It's a bit odd, in that the ip_summed == CHECKSUM_NONE ends up just
> > being ignored and devices are trusted to always be able to checksum
> > offload when they can segment offload -- even when the device does not
> > advertise checksum offload.
> >
> > I think we should have a follow-on that makes advertising
> > NETIF_F_GSO_UDP_L4 dependent on having at least one of the
> > NETIF_F_*_CSUM bits set (handwaving over what happens when only
> > advertising NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM).
> 
> I agree. I've also gained some clarity as to how the fix should
> look. Let's circle back to it, if we still think it's relevant once we
> hash out the fix.
> 
> After spending some quality time debugging the addded regression test
> [1], I've realized this fix is wrong.
> 
> You see, with commit 10154dbded6d ("udp: Allow GSO transmit from devices
> with no checksum offload"), I've opened up the UDP_SEGMENT API to two
> uses, which I think should not be allowed:
> 
> 1. Hardware USO for IPv6 dgrams with extension headers
> 
> Previously that led to -EIO, because __ip6_append_data won't annotate
> such packets as CHECKSUM_PARTIAL.
> 
> I'm guessing that we do this because some drivers that advertise csum
> offload can't actually handle checksumming when extension headers are
> present.
> 
> Extension headers are not part of IPv6 pseudo header, but who knows what
> limitations NIC firmwares have.
> 
> Either way, changing it just like that sounds risky, so I think we need
> to fall back to software USO with software checksum in this case.
> 
> Alternatively, we could catch it in the udp layer and error out with EIO
> as ealier. But that shifts some burden onto the user space (detect and
> segment before sendmsg()).
> 
> 2. Hardware USO when hardware csum is unsupported / disabled
> 
> That sounds like a pathological device configuration case, but since it
> is possible today with some drivers to disable csum offload for one IP
> version and not the other, it seems safest to just handle that
> gracefully.
> 
> I don't know why one might want to do that. Perhaps as a workaround for
> some firmware bug while waiting for a fix?

I doubt that this is actually used. But today it can be configured.

Which is why I suggested making NETIF_F_GSO_UDP_L4 dependent on csum
offload (in netdev_fix_features). I doubt that that will break any
real user.
 
> In this scenario I think we also need to fall back to software USO and
> checksum.
> 
> Code-wise that could look like below. WDYT?

Since this only affects USO, can we fix this is in __udp_gso_segment.
Basically, not taking the NETIF_F_GSO_ROBUST path.

skb_segment is so complicated already. Whatever we can do to avoid
adding to that.

> [1] I feel silly for submitting a broken patch. I've been running a
> stale test prog in a VM. It seems that virtme-ng with a read+write
> overlay for rootfs played a trick on me. Changes to the host files are
> not (always?) visible to the guest.
> 
> ---8<---
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 83f8cd8aa2d1..819173807c81 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -4806,7 +4806,13 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
>  			goto perform_csum_check;
>  
>  		if (!sg) {
> -			if (!csum) {
> +			/* Fall back to software checksum for segments if:
> +			 * 1) device can't checksum this network protocol, OR
> +			 * 2) we consider the packet to be not checksummable in
> +			 *    hardware, for example IPv6 extension headers are
> +			 *    present.
> +			 */
> +			if (!csum || head_skb->ip_summed != CHECKSUM_PARTIAL) {
>  				if (!nskb->remcsum_offload)
>  					nskb->ip_summed = CHECKSUM_NONE;
>  				SKB_GSO_CB(nskb)->csum =
> @@ -4896,7 +4902,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
>  		nskb->truesize += nskb->data_len;
>  
>  perform_csum_check:
> -		if (!csum) {
> +		if (!csum || head_skb->ip_summed != CHECKSUM_PARTIAL) {
>  			if (skb_has_shared_frag(nskb) &&
>  			    __skb_linearize(nskb))
>  				goto err;
> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> index bc8a9da750fe..0a34b418b83c 100644
> --- a/net/ipv4/udp_offload.c
> +++ b/net/ipv4/udp_offload.c
> @@ -282,7 +282,15 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
>  		     skb_transport_header(gso_skb)))
>  		return ERR_PTR(-EINVAL);
>  
> -	if (skb_gso_ok(gso_skb, features | NETIF_F_GSO_ROBUST)) {
> +	/* Pass-through to device for segmentation only if:
> +	 * 1) we consider the packet checksummable in hardware, that is no
> +	 *    IPv6 extension headers present, AND
> +	 * 2) device supports checksum offload for this network protocol
> +	 *    (NETIF_F_{IP,IPV6}_CSUM or NETIF_F_HW_CSUM), AND
> +	 * 3) device supports the requested GSO kind.
> +	 */
> +	if (gso_skb->ip_summed == CHECKSUM_PARTIAL &&
> +	    skb_gso_ok(gso_skb, features | NETIF_F_GSO_ROBUST)) {
>  		/* Packet is from an untrusted source, reset gso_segs. */
>  		skb_shinfo(gso_skb)->gso_segs = DIV_ROUND_UP(gso_skb->len - sizeof(*uh),
>  							     mss);



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ