lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZtsTGp9FounnxZaN@arm.com>
Date: Fri, 6 Sep 2024 15:35:06 +0100
From: Szabolcs Nagy <szabolcs.nagy@....com>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>,
	netdev@...r.kernel.org
Cc: davem@...emloft.net, kuba@...nel.org, edumazet@...gle.com,
	pabeni@...hat.com, mst@...hat.com, jasowang@...hat.com,
	arefev@...mel.ru, alexander.duyck@...il.com,
	Willem de Bruijn <willemb@...gle.com>, stable@...r.kernel.org,
	Jakub Sitnicki <jakub@...udflare.com>, Felix Fietkau <nbd@....name>,
	Mark Brown <broonie@...nel.org>,
	Yury Khrustalev <yury.khrustalev@....com>, nd@....com
Subject: Re: [PATCH net v2] net: drop bad gso csum_start and offset in
 virtio_net_hdr

The 07/29/2024 16:10, Willem de Bruijn wrote:
> From: Willem de Bruijn <willemb@...gle.com>
> 
> Tighten csum_start and csum_offset checks in virtio_net_hdr_to_skb
> for GSO packets.
> 
> The function already checks that a checksum requested with
> VIRTIO_NET_HDR_F_NEEDS_CSUM is in skb linear. But for GSO packets
> this might not hold for segs after segmentation.
> 
> Syzkaller demonstrated to reach this warning in skb_checksum_help
> 
> 	offset = skb_checksum_start_offset(skb);
> 	ret = -EINVAL;
> 	if (WARN_ON_ONCE(offset >= skb_headlen(skb)))
> 
> By injecting a TSO packet:
> 
> WARNING: CPU: 1 PID: 3539 at net/core/dev.c:3284 skb_checksum_help+0x3d0/0x5b0
>  ip_do_fragment+0x209/0x1b20 net/ipv4/ip_output.c:774
>  ip_finish_output_gso net/ipv4/ip_output.c:279 [inline]
>  __ip_finish_output+0x2bd/0x4b0 net/ipv4/ip_output.c:301
>  iptunnel_xmit+0x50c/0x930 net/ipv4/ip_tunnel_core.c:82
>  ip_tunnel_xmit+0x2296/0x2c70 net/ipv4/ip_tunnel.c:813
>  __gre_xmit net/ipv4/ip_gre.c:469 [inline]
>  ipgre_xmit+0x759/0xa60 net/ipv4/ip_gre.c:661
>  __netdev_start_xmit include/linux/netdevice.h:4850 [inline]
>  netdev_start_xmit include/linux/netdevice.h:4864 [inline]
>  xmit_one net/core/dev.c:3595 [inline]
>  dev_hard_start_xmit+0x261/0x8c0 net/core/dev.c:3611
>  __dev_queue_xmit+0x1b97/0x3c90 net/core/dev.c:4261
>  packet_snd net/packet/af_packet.c:3073 [inline]
> 
> The geometry of the bad input packet at tcp_gso_segment:
> 
> [   52.003050][ T8403] skb len=12202 headroom=244 headlen=12093 tailroom=0
> [   52.003050][ T8403] mac=(168,24) mac_len=24 net=(192,52) trans=244
> [   52.003050][ T8403] shinfo(txflags=0 nr_frags=1 gso(size=1552 type=3 segs=0))
> [   52.003050][ T8403] csum(0x60000c7 start=199 offset=1536
> ip_summed=3 complete_sw=0 valid=0 level=0)
> 
> Mitigate with stricter input validation.
> 
> csum_offset: for GSO packets, deduce the correct value from gso_type.
> This is already done for USO. Extend it to TSO. Let UFO be:
> udp[46]_ufo_fragment ignores these fields and always computes the
> checksum in software.
> 
> csum_start: finding the real offset requires parsing to the transport
> header. Do not add a parser, use existing segmentation parsing. Thanks
> to SKB_GSO_DODGY, that also catches bad packets that are hw offloaded.
> Again test both TSO and USO. Do not test UFO for the above reason, and
> do not test UDP tunnel offload.
> 
> GSO packet are almost always CHECKSUM_PARTIAL. USO packets may be
> CHECKSUM_NONE since commit 10154dbded6d6 ("udp: Allow GSO transmit
> from devices with no checksum offload"), but then still these fields
> are initialized correctly in udp4_hwcsum/udp6_hwcsum_outgoing. So no
> need to test for ip_summed == CHECKSUM_PARTIAL first.
> 
> This revises an existing fix mentioned in the Fixes tag, which broke
> small packets with GSO offload, as detected by kselftests.
> 
> Link: https://syzkaller.appspot.com/bug?extid=e1db31216c789f552871
> Link: https://lore.kernel.org/netdev/20240723223109.2196886-1-kuba@kernel.org
> Fixes: e269d79c7d35 ("net: missing check virtio")
> Cc: stable@...r.kernel.org
> Signed-off-by: Willem de Bruijn <willemb@...gle.com>
> 
> ---
> 
> v1->v2
>   - skb_transport_header instead of skb->transport_header (edumazet@)
>   - typo: migitate -> mitigate
> ---

this breaks booting from nfs root on an arm64 fvp
model for me.

i see two fixup commits

commit 30b03f2a0592eee1267298298eac9dd655f55ab2
Author:     Jakub Sitnicki <jakub@...udflare.com>
AuthorDate: 2024-08-08 11:56:22 +0200
Commit:     Jakub Kicinski <kuba@...nel.org>
CommitDate: 2024-08-09 21:58:08 -0700

    udp: Fall back to software USO if IPv6 extension headers are present

and

commit b128ed5ab27330deeeaf51ea8bb69f1442a96f7f
Author:     Felix Fietkau <nbd@....name>
AuthorDate: 2024-08-19 17:06:21 +0200
Commit:     Jakub Kicinski <kuba@...nel.org>
CommitDate: 2024-08-21 17:15:05 -0700

    udp: fix receiving fraglist GSO packets

but they don't fix the issue for me,
at the boot console i see

...
[    3.686846] Sending DHCP requests ., OK
[    3.687302] IP-Config: Got DHCP answer from 172.20.51.254, my address is 172.20.51.1
[    3.687423] IP-Config: Complete:
[    3.687482]      device=eth0, hwaddr=ea:0d:79:71:af:cd, ipaddr=172.20.51.1, mask=255.255.255.0, gw=172.20.51.254
[    3.687631]      host=172.20.51.1, domain=, nis-domain=(none)
[    3.687719]      bootserver=172.20.51.254, rootserver=10.2.80.41, rootpath=
[    3.687771]      nameserver0=172.20.51.254, nameserver1=172.20.51.252, nameserver2=172.20.51.251
[    3.689075] clk: Disabling unused clocks
[    3.689167] PM: genpd: Disabling unused power domains
[    3.689258] ALSA device list:
[    3.689330]   No soundcards found.
[    3.716297] VFS: Mounted root (nfs4 filesystem) on device 0:24.
[    3.716843] devtmpfs: mounted
[    3.734352] Freeing unused kernel memory: 10112K
[    3.735178] Run /sbin/init as init process
[    3.743770] eth0: bad gso: type: 1, size: 1440
[    3.744186] eth0: bad gso: type: 1, size: 1440
...
[  154.610991] eth0: bad gso: type: 1, size: 1440
[  185.330941] nfs: server 10.2.80.41 not responding, still trying
...

the "bad gso" message keeps repeating and init
is not executed.

if i revert the 3 patches above on 6.11-rc6 then
init runs without "bad gso" error.

this affects testing the arm64-gcs patches on
top of 6.11-rc3 and 6.11-rc6

not sure if this is an fvp or kernel bug.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ