linux-kernel - Re: [PATCH net-next] net:prevent shared skb corruption on rx-gro-list segmentation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANn89iK4M-=v0k_Aw_noWb_duDUZ54_3WKZV57BFRSQ1fhxcZA@mail.gmail.com>
Date:   Fri, 27 Oct 2023 14:39:52 +0200
From:   Eric Dumazet <edumazet@...gle.com>
To:     Lena Wang (王娜) <Lena.Wang@...iatek.com>
Cc:     "davem@...emloft.net" <davem@...emloft.net>,
        "kuba@...nel.org" <kuba@...nel.org>,
        "matthias.bgg@...il.com" <matthias.bgg@...il.com>,
        "pabeni@...hat.com" <pabeni@...hat.com>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-mediatek@...ts.infradead.org" 
        <linux-mediatek@...ts.infradead.org>,
        "maze@...gle.com" <maze@...gle.com>,
        "cmllamas@...gle.com" <cmllamas@...gle.com>
Subject: Re: [PATCH net-next] net:prevent shared skb corruption on rx-gro-list segmentation

On Fri, Oct 27, 2023 at 1:28 PM Lena Wang (王娜) <Lena.Wang@...iatek.com> wrote:
>
> From 1d4aea846dd08d824e1097a933ed8618a6da03da Mon Sep 17 00:00:00 2001
> From: lena wang <lena.wang@...iatek.com>
> Date: Fri, 27 Oct 2023 17:52:50 +0800
> Subject: [PATCH net-next] net:prevent shared skb corruption on rx-gro-
> list segmentation
>
> shared skb triggers corruptions on rx-gro-list segmentation.
>
> [42438.749474] [ T1425] Unable to handle kernel NULL pointer
> dereference at virtual address 00000000000000d8
> [42438.749530] [ T1425] Mem abort info:
> [42438.749547] [ T1425]   ESR = 0x0000000096000006
> [42438.749566] [ T1425]   EC = 0x25: DABT (current EL), IL = 32 bits
> [42438.749588] [ T1425]   SET = 0, FnV = 0
> [42438.749606] [ T1425]   EA = 0, S1PTW = 0
> [42438.749623] [ T1425]   FSC = 0x06: level 2 translation fault
> [42438.749642] [ T1425] Data abort info:
> [42438.749656] [ T1425]   ISV = 0, ISS = 0x00000006
> [42438.749674] [ T1425]   CM = 0, WnR = 0
> [42438.749691] [ T1425] user pgtable: 4k pages, 39-bit VAs,
> pgdp=00000001f96b9000
> [42438.749714] [ T1425] [00000000000000d8] pgd=08000001f96ba003,
> p4d=08000001f96ba003, pud=08000001f96ba003, pmd=0000000000000000
> [42438.749769] [ T1425] Internal error: Oops: 0000000096000006 [#1]
> PREEMPT SMP
> [42438.749796] [ T1425] cpufreq stop DVFS log done
> [42438.797144] [ T1425] Kernel Offset: 0x29eb000000 from
> 0xffffffc008000000
> [42438.797185] [ T1425] PHYS_OFFSET: 0x40000000
> [42438.797203] [ T1425] pstate: 80400005 (Nzcv daif +PAN -UAO)
> [42438.797219] [ T1425] pc : [0xffffffe9f3e38c34]
> __udp_gso_segment+0x24c/0x48c
> [42438.797248] [ T1425] lr : [0xffffffe9f3e38bf4]
> __udp_gso_segment+0x20c/0x48c
> [42438.797269] [ T1425] sp : ffffffc017acb5c0
> [42438.797280] [ T1425] x29: ffffffc017acb5d0 x28: 000000000000c85f
> ......
> [42438.799172] [ T1425] Call trace:
> [42438.799184] [ T1425]  dump_backtrace+0xf4/0x118
> [42438.799210] [ T1425]  show_stack+0x18/0x24
> [42438.799227] [ T1425]  dump_stack_lvl+0x60/0x7c
> [42438.799250] [ T1425]  dump_stack+0x18/0x3c
> [42438.799270] [ T1425]  mrdump_common_die+0x24c/0x398 [mrdump]
> [42438.799354] [ T1425]  ipanic_die+0x20/0x34 [mrdump]
> [42438.799425] [ T1425]  notify_die+0x80/0xd8
> [42438.799447] [ T1425]  die+0x94/0x2b8
> [42438.799463] [ T1425]  __do_kernel_fault+0x264/0x298
> [42438.799485] [ T1425]  do_page_fault+0x98/0x4a0
> [42438.799503] [ T1425]  do_translation_fault+0x38/0x54
> [42438.799521] [ T1425]  do_mem_abort+0x58/0x118
> [42438.799540] [ T1425]  el1_abort+0x3c/0x5c
> [42438.799555] [ T1425]  el1h_64_sync_handler+0x54/0x90
> [42438.799575] [ T1425]  el1h_64_sync+0x68/0x6c
> [42438.799591] [ T1425]  __udp_gso_segment+0x24c/0x48c
> [42438.799611] [ T1425]  udp4_ufo_fragment+0x118/0x15c
> [42438.799628] [ T1425]  inet_gso_segment+0x164/0x338
> [42438.799644] [ T1425]  skb_mac_gso_segment+0xc4/0x13c
> [42438.799666] [ T1425]  __skb_gso_segment+0xc4/0x124
> [42438.799686] [ T1425]  validate_xmit_skb+0x9c/0x2e0
> [42438.799703] [ T1425]  validate_xmit_skb_list+0x4c/0x80
> [42438.799722] [ T1425]  sch_direct_xmit+0x70/0x3d0
> [42438.799739] [ T1425]  __dev_queue_xmit+0x5f0/0xd40
> [42438.799757] [ T1425]  ip_finish_output2+0x3f8/0x460
> [42438.799774] [ T1425]  __ip_finish_output+0x194/0x240
> [42438.799790] [ T1425]  ip_finish_output+0x20/0xf4
> [42438.799810] [ T1425]  ip_output+0x100/0x1a0
> [42438.799830] [ T1425]  NF_HOOK+0xac/0x154
> [42438.799848] [ T1425]  ip_forward+0x308/0x320
> [42438.799868] [ T1425]  ip_sublist_rcv+0x1f0/0x25c
> [42438.799887] [ T1425]  ip_list_rcv+0x138/0x174
> [42438.799905] [ T1425]  __netif_receive_skb_list_core+0x1e8/0x28c
> [42438.799923] [ T1425]  netif_receive_skb_list_internal+0x1ec/0x2b4
> [42438.799942] [ T1425]  netif_receive_skb_list+0x2c/0x144
> [42438.799960] [ T1425]  ccmni_queue_state_callback+0x98/0x304 [ccmni]
> [42438.800009] [ T1425]  port_net_queue_state_notify+0x178/0x1c0
> [ccci_md_all]
> [42438.800257] [ T1425]  ccci_port_queue_status_notify+0xf0/0x2a8
> [ccci_md_all]
> [42438.800496] [ T1425]  dpmaif_rxq_push_thread+0xb0/0x1d4
> [ccci_dpmaif]
> [42438.800602] [ T1425]  kthread+0x104/0x1d4
> [42438.800621] [ T1425]  ret_from_fork+0x10/0x20
>
> In some scenarios the GRO-ed skb shared with multi users. This
> segmentation touches the shared heads which sets frag_list to null.
> After linearization the skb->next is null which results the corruption.

What scenarios would do that ?

Packets sent to ndo_start_xmit() are not supposed to be shared. (with
pktgen exception, which is well understood)

A driver wants to be able to overwrite parts of skb (skb->next and
many other fields), this is not specific to GSO

A real fix (if needed) would probably be elsewhere.

>
> So for shared skb, it needs to clone first than unclone with header and
> data separated for different devices.
>
> Signed-off-by: lena wang <lena.wang@...iatek.com>

Why targeting net-next ?


> ---
>  net/core/skbuff.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index b157efea5dea..adeb3ad9697b 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -4351,6 +4351,12 @@ struct sk_buff *skb_segment_list(struct sk_buff
> *skb,
>
>         skb_push(skb, -skb_network_offset(skb) + offset);
>
> +       if (skb_shared(skb)) {
> +               skb = skb_share_check(skb, GFP_ATOMIC);
> +               if (!skb)
> +                       goto err_linearize;
> +       }
> +
>         /* Ensure the head is writeable before touching the shared info
> */
>         err = skb_unclone(skb, GFP_ATOMIC);
>         if (err)
> --
> 2.18.0