[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <6830807a4a4b1_180c78294e7@willemb.c.googlers.com.notmuch>
Date: Fri, 23 May 2025 10:04:42 -0400
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Shiming Cheng <shiming.cheng@...iatek.com>,
willemdebruijin.kernel@...il.com,
edumazet@...gle.com,
davem@...emloft.net,
kuba@...nel.org,
pabeni@...hat.com,
matthias.bgg@...il.com
Cc: linux-kernel@...r.kernel.org,
netdev@...r.kernel.org,
shiming.cheng@...iatek.com,
lena.wang@...iatek.com
Subject: Re: [PATCH] net: fix udp gso skb_segment after pull from frag_list
Shiming Cheng wrote:
> Detect invalid geometry due to pull from frag_list, and pass to
> regular skb_segment. if only part of the fraglist payload is pulled
This does not match the patch.
This is quoted from another patch that moves skb handling from
skb_segment_list to skb_segment.
This patch does not do that.
Btw, don't forget to target the right list. [PATCH net]
> into head_skb, When splitting packets in the skb_segment function,
> it will always cause exception as below.
>
> Valid SKB_GSO_FRAGLIST skbs
> - consist of two or more segments
> - the head_skb holds the protocol headers plus first gso_size
> - one or more frag_list skbs hold exactly one segment
> - all but the last must be gso_size
>
> Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
> modify fraglist skbs, breaking these invariants.
>
> In extreme cases they pull one part of data into skb linear. For UDP,
> this causes three payloads with lengths of (11,11,10) bytes were
> pulled tail to become (12,10,10) bytes.
>
> When splitting packets in the skb_segment function, the first two
> packets of (11,11) bytes are split using skb_copy_bits. But when
> the last packet of 10 bytes is split, because hsize becomes nagative,
> it enters the skb_clone process instead of continuing to use
> skb_copy_bits. In fact, the data for skb_clone has already been
> copied into the second packet.
>
> when hsize < 0, the payload of the fraglist has already been copied
> (with skb_copy_bits), so there is no need to enter skb_clone to
> process this packet. Instead, continue using skb_copy_bits to process
> the next packet.
I don't fully follow this. And as always with skb_segment am concerned
that changing this condition may break other valid uses of this code.
If this is a fraglist gso packet (SKB_GSO_FRAGLIST), I have a mind to
just flag those that fail to meet the four invariants at the top, and
in that case take a slow path that linearizes the skb and passes that
to skb_segment.
That should take care of a whole host of such bad fraglist gso skbs,
without possibly breaking existing gso skb handling.
> el1h_64_sync_handler+0x3c/0x90
> el1h_64_sync+0x68/0x6c
> skb_segment+0xcd0/0xd14
> __udp_gso_segment+0x334/0x5f4
> udp4_ufo_fragment+0x118/0x15c
> inet_gso_segment+0x164/0x338
> skb_mac_gso_segment+0xc4/0x13c
> __skb_gso_segment+0xc4/0x124
> validate_xmit_skb+0x9c/0x2c0
> validate_xmit_skb_list+0x4c/0x80
> sch_direct_xmit+0x70/0x404
> __dev_queue_xmit+0x64c/0xe5c
> neigh_resolve_output+0x178/0x1c4
> ip_finish_output2+0x37c/0x47c
> __ip_finish_output+0x194/0x240
> ip_finish_output+0x20/0xf4
> ip_output+0x100/0x1a0
> NF_HOOK+0xc4/0x16c
> ip_forward+0x314/0x32c
> ip_rcv+0x90/0x118
> __netif_receive_skb+0x74/0x124
> process_backlog+0xe8/0x1a4
> __napi_poll+0x5c/0x1f8
> net_rx_action+0x154/0x314
> handle_softirqs+0x154/0x4b8
> __do_softirq+0x14/0x20
>
> [ 118.376811] [C201134] dpmaif_rxq0_pus: [name:bug&]kernel BUG at net/core/skbuff.c:4278!
> [ 118.376829] [C201134] dpmaif_rxq0_pus: [name:traps&]Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
> [ 118.376858] [C201134] dpmaif_rxq0_pus: [name:mediatek_cpufreq_hw&]cpufreq stop DVFS log done
> [ 118.470774] [C201134] dpmaif_rxq0_pus: [name:mrdump&]Kernel Offset: 0x178cc00000 from 0xffffffc008000000
> [ 118.470810] [C201134] dpmaif_rxq0_pus: [name:mrdump&]PHYS_OFFSET: 0x40000000
> [ 118.470827] [C201134] dpmaif_rxq0_pus: [name:mrdump&]pstate: 60400005 (nZCv daif +PAN -UAO)
> [ 118.470848] [C201134] dpmaif_rxq0_pus: [name:mrdump&]pc : [0xffffffd79598aefc] skb_segment+0xcd0/0xd14
> [ 118.470900] [C201134] dpmaif_rxq0_pus: [name:mrdump&]lr : [0xffffffd79598a5e8] skb_segment+0x3bc/0xd14
> [ 118.470928] [C201134] dpmaif_rxq0_pus: [name:mrdump&]sp : ffffffc008013770
> [ 118.470941] [C201134] dpmaif_rxq0_pus: [name:mrdump&]x29: ffffffc008013810 x28: 0000000000000040
> [ 118.470961] [C201134] dpmaif_rxq0_pus: [name:mrdump&]x27: 000000000000002a x26: faffff81338f5500
> [ 118.470976] [C201134] dpmaif_rxq0_pus: [name:mrdump&]x25: f9ffff800c87e000 x24: 0000000000000000
> [ 118.470991] [C201134] dpmaif_rxq0_pus: [name:mrdump&]x23: 000000000000004b x22: f4ffff81338f4c00
> [ 118.471005] [C201134] dpmaif_rxq0_pus: [name:mrdump&]x21: 000000000000000b x20: 0000000000000000
> [ 118.471019] [C201134] dpmaif_rxq0_pus: [name:mrdump&]x19: fdffff8077db5dc8 x18: 0000000000000000
> [ 118.471033] [C201134] dpmaif_rxq0_pus: [name:mrdump&]x17: 00000000ad6b63b6 x16: 00000000ad6b63b6
> [ 118.471047] [C201134] dpmaif_rxq0_pus: [name:mrdump&]x15: ffffffd795aa59d4 x14: ffffffd795aa7bc4
> [ 118.471061] [C201134] dpmaif_rxq0_pus: [name:mrdump&]x13: f4ffff806d40bc00 x12: 0000000100000000
> [ 118.471075] [C201134] dpmaif_rxq0_pus: [name:mrdump&]x11: 0054000800000000 x10: 0000000000000040
> [ 118.471089] [C201134] dpmaif_rxq0_pus: [name:mrdump&]x9 : 0000000000000040 x8 : 0000000000000055
> [ 118.471104] [C201134] dpmaif_rxq0_pus: [name:mrdump&]x7 : ffffffd7959b0868 x6 : ffffffd7959aeebc
> [ 118.471118] [C201134] dpmaif_rxq0_pus: [name:mrdump&]x5 : f8ffff8132ac5720 x4 : ffffffc0080134a8
> [ 118.471131] [C201134] dpmaif_rxq0_pus: [name:mrdump&]x3 : 0000000000000a20 x2 : 0000000000000001
> [ 118.471145] [C201134] dpmaif_rxq0_pus: [name:mrdump&]x1 : 000000000000000a x0 : faffff81338f5500
>
> BUG_ON:
> pos += skb_headlen(list_skb);
> while (pos < offset + len) {
> BUG_ON(i >= nfrags);
> size = skb_frag_size(frag);
>
> Fixes: dbd50f238dec ("net: move the hsize check to the else block in skb_segment")
> Signed-off-by: Shiming Cheng <shiming.cheng@...iatek.com>
> ---
> net/core/skbuff.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 6841e61a6bd0..f9888f8dc3fa 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -4808,7 +4808,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
>
> hsize = skb_headlen(head_skb) - offset;
>
> - if (hsize <= 0 && i >= nfrags && skb_headlen(list_skb) &&
> + if (hsize == 0 && i >= nfrags && skb_headlen(list_skb) &&
> (skb_headlen(list_skb) == len || sg)) {
> BUG_ON(skb_headlen(list_skb) > len);
>
> --
> 2.45.2
>
Powered by blists - more mailing lists