lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <65f5a2e1ed02e_6ef3e29463@willemb.c.googlers.com.notmuch>
Date: Sat, 16 Mar 2024 09:47:13 -0400
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Lena Wang (王娜) <Lena.Wang@...iatek.com>, 
 "davem@...emloft.net" <davem@...emloft.net>, 
 "kuba@...nel.org" <kuba@...nel.org>, 
 Shiming Cheng (成诗明) <Shiming.Cheng@...iatek.com>, 
 "pabeni@...hat.com" <pabeni@...hat.com>, 
 "edumazet@...gle.com" <edumazet@...gle.com>
Cc: "netdev@...r.kernel.org" <netdev@...r.kernel.org>, 
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net] udp: fix segmentation crash for untrusted source
 packet

Lena Wang (王娜) wrote:
> On Wed, 2024-03-13 at 16:41 +0100, Paolo Abeni wrote:
> >  	 
> > External email : Please do not click links or open attachments until
> > you have verified the sender or the content.
> >  On Wed, 2024-03-13 at 21:34 +0800, Shiming Cheng wrote:
> > > Kernel exception is reported when making udp frag list
> > segmentation.
> > > Backtrace is as below:
> > >     at out/android15-6.6/kernel-6.6/kernel-
> > 6.6/net/ipv4/udp_offload.c:229
> > >     at out/android15-6.6/kernel-6.6/kernel-
> > 6.6/net/ipv4/udp_offload.c:262
> > > features=features@...ry=19, is_ipv6=false)
> > >     at out/android15-6.6/kernel-6.6/kernel-
> > 6.6/net/ipv4/udp_offload.c:289
> > > features=19)
> > >     at out/android15-6.6/kernel-6.6/kernel-
> > 6.6/net/ipv4/udp_offload.c:399
> > > features=19)
> > >     at out/android15-6.6/kernel-6.6/kernel-
> > 6.6/net/ipv4/af_inet.c:1418
> > > skb@...ry=0x0, features=19, features@...ry=0)
> > >     at out/android15-6.6/kernel-6.6/kernel-6.6/net/core/gso.c:53
> > > tx_path=<optimized out>)
> > >     at out/android15-6.6/kernel-6.6/kernel-6.6/net/core/gso.c:124
> > 
> > A full backtrace would help better understanding the issue.
> 
> Below is full backtrace:
>  [ 1100.812205][    C3] CPU: 3 PID: 0 Comm: swapper/3 Tainted:
> G        W  OE      6.6.17-android15-0-g380371ea9bf1 #1
>  [ 1100.812211][    C3] Hardware name: MT6991(ENG) (DT)
>  [ 1100.812215][    C3] Call trace:
>  [ 1100.812218][    C3]  dump_backtrace+0xec/0x138
>  [ 1100.812222][    C3]  show_stack+0x18/0x24
>  [ 1100.812226][    C3]  dump_stack_lvl+0x50/0x6c
>  [ 1100.812232][    C3]  dump_stack+0x18/0x24
>  [ 1100.812237][    C3]  mrdump_common_die+0x24c/0x388 [mrdump]
>  [ 1100.812259][    C3]  ipanic_die+0x20/0x34 [mrdump]
>  [ 1100.812269][    C3]  notifier_call_chain+0x90/0x174
>  [ 1100.812275][    C3]  notify_die+0x50/0x8c
>  [ 1100.812279][    C3]  die+0x94/0x308
>  [ 1100.812283][    C3]  __do_kernel_fault+0x240/0x26c
>  [ 1100.812288][    C3]  do_page_fault+0xa0/0x48c
>  [ 1100.812293][    C3]  do_translation_fault+0x38/0x54
>  [ 1100.812297][    C3]  do_mem_abort+0x58/0x104
>  [ 1100.812302][    C3]  el1_abort+0x3c/0x5c
>  [ 1100.812307][    C3]  el1h_64_sync_handler+0x54/0x90
>  [ 1100.812313][    C3]  el1h_64_sync+0x68/0x6c
>  [ 1100.812317][    C3]  __udp_gso_segment+0x298/0x4d4
>  [ 1100.812322][    C3]  udp4_ufo_fragment+0x130/0x174
>  [ 1100.812326][    C3]  inet_gso_segment+0x164/0x330
>  [ 1100.812330][    C3]  skb_mac_gso_segment+0xc4/0x13c
>  [ 1100.812335][    C3]  __skb_gso_segment+0xc4/0x120
>  [ 1100.812339][    C3]  udp_rcv_segment+0x50/0x134
>  [ 1100.812344][    C3]  udp_queue_rcv_skb+0x74/0x114
>  [ 1100.812348][    C3]  udp_unicast_rcv_skb+0x94/0xac
>  [ 1100.812353][    C3]  __udp4_lib_rcv+0x3e0/0x818
>  [ 1100.812358][    C3]  udp_rcv+0x20/0x30
>  [ 1100.812362][    C3]  ip_protocol_deliver_rcu+0x194/0x368
>  [ 1100.812368][    C3]  ip_local_deliver+0xe4/0x184
>  [ 1100.812373][    C3]  ip_rcv+0x90/0x118
>  [ 1100.812378][    C3]  __netif_receive_skb+0x74/0x124
>  [ 1100.812383][    C3]  process_backlog+0xd8/0x18c
>  [ 1100.812388][    C3]  __napi_poll+0x5c/0x1fc
>  [ 1100.812392][    C3]  net_rx_action+0x150/0x334
>  [ 1100.812397][    C3]  __do_softirq+0x120/0x3f4
>  [ 1100.812401][    C3]  ____do_softirq+0x10/0x20
>  [ 1100.812405][    C3]  call_on_irq_stack+0x3c/0x74
>  [ 1100.812410][    C3]  do_softirq_own_stack+0x1c/0x2c
>  [ 1100.812414][    C3]  __irq_exit_rcu+0x5c/0xd4
>  [ 1100.812418][    C3]  irq_exit_rcu+0x10/0x1c
>  [ 1100.812422][    C3]  el1_interrupt+0x38/0x58
>  [ 1100.812428][    C3]  el1h_64_irq_handler+0x18/0x24
>  [ 1100.812434][    C3]  el1h_64_irq+0x68/0x6c
>  [ 1100.812437][    C3]  arch_local_irq_enable+0x4/0x8
>  [ 1100.812443][    C3]  cpuidle_enter+0x38/0x54
>  [ 1100.812449][    C3]  do_idle+0x198/0x294
>  [ 1100.812454][    C3]  cpu_startup_entry+0x34/0x3c
>  [ 1100.812459][    C3]  secondary_start_kernel+0x138/0x158
>  [ 1100.812465][    C3]  __secondary_switched+0xc0/0xc4
> 
> > > This packet's frag list is null while gso_type is not 0. Then it is
> > treated
> > > as a GRO-ed packet and sent to segment frag list. Function call
> > path is
> > > udp_rcv_segment => config features value
> > >     __udpv4_gso_segment  => skb_gso_ok returns false. Here it
> > should be
> > >                             true. 
> > 
> > Why? If I read correctly the above, this is GSO packet landing in an
> > UDP socket with no UDP_GRO sockopt. The packet is expected to be
> > segmented again.
> > 
> Yes, it is GSO packet, however the fragment list of this GSO packet
> becomes NULL. As the occurrence rate is very low, we really don’t know
> why and when it becomes to be NULL. It happens both in cellular and
> wlan network and seems an unknown kernel issue.
>
> To avoid crash the packet should skip to be segmented when fraglist is
> null.
> 
> > >Failed reason is features doesn't
> > match
> > >                             gso_type.
> > >         __udp_gso_segment_list
> > >             skb_segment_list => packet is linear with skb->next =
> > NULL
> > >             __udpv4_gso_segment_list_csum => use skb->next directly
> > and
> > >                                              crash happens
> > > 
> > > In rx-gro-list GRO-ed packet is set gso type as
> > > NETIF_F_GSO_UDP_L4 | NETIF_F_GSO_FRAGLIST in napi_gro_complete. In
> > gso
> > > flow the features should also set them to match with gso_type. Or
> > else it
> > > will always return false in skb_gso_ok. Then it can't discover the
> > > untrusted source packet and result crash in following function.
> > 
> > What is the 'untrusted source' here? I read the above as the packet
> > aggregation happened in the GRO engine???
> > 
> > Could you please give a complete description of the relevant
> > scenario?
> > 
> 
> According to the backtrace info, we infer it is a rx-frag_list GRO

It would be helpful to see an skb_dump. But if this happens rarely in
production, understood if that is not feasible.

The packet arrives on process_backlog, so still not sure how it is
produced.

> packet. Before sending into the UDP socket with no UDP_GRO sockopt, it
> seems enter "skb_condense" to trim it and loose his frag list. However
> it still keeps gso_type and gso_size. Then it continues to do
> skb_segment_list.
> 
> First crash happens in skb_segment_list. 
> This patch resolves the crash and lets the packet becomes a skb without
> skb->next:
> https://lore.kernel.org/all/Y9gt5EUizK1UImEP@debian/
> Then crash moves to __udp_gso_sement_list -> skb_segment_list(finish)
> -> __udpv4_gso_segment_list_csum, it uses skb->next without check then
> crash.
> 
> 
> What we want to do is to drop this abnormal packet.

I think we want to deliver this packet if possible.

Thanks for the added context. So this is assumed to be a GSO skb with
SKB_GSO_FRAGLIST that somewhere lots its fraglist? That is the bug
if true.

You are suggesting that this happens in the skb_condense in
__udp_enqueue_schedule_skb?

If generated by GRO then on a device that has NETIF_F_GRO_FRAGLIST set.
So one workaround (not fix) is to disable that.

> So we set features
> NETIF_F_GSO_UDP_L4 |NETIF_F_GSO_FRAGLIST to match fixes: f2696099c6c6
> condation then drop it. 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ