Date: Tue, 26 Mar 2024 13:10:07 +0000
From: Lena Wang (王娜) <Lena.Wang@...iatek.com>
To: "davem@...emloft.net" <davem@...emloft.net>, "kuba@...nel.org"
	<kuba@...nel.org>, Shiming Cheng (成诗明)
	<Shiming.Cheng@...iatek.com>, "pabeni@...hat.com" <pabeni@...hat.com>,
	"willemdebruijn.kernel@...il.com" <willemdebruijn.kernel@...il.com>,
	"edumazet@...gle.com" <edumazet@...gle.com>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net] udp: fix segmentation crash for untrusted source
 packet

On Sat, 2024-03-16 at 09:47 -0400, Willem de Bruijn wrote:
>  Lena Wang (王娜) wrote:
> > On Wed, 2024-03-13 at 16:41 +0100, Paolo Abeni wrote:
> > >  On Wed, 2024-03-13 at 21:34 +0800, Shiming Cheng wrote:
> > > > Kernel exception is reported when making udp frag list segmentation.
> > > > Backtrace is as below:
> > > >     at out/android15-6.6/kernel-6.6/kernel-6.6/net/ipv4/udp_offload.c:229
> > > >     at out/android15-6.6/kernel-6.6/kernel-6.6/net/ipv4/udp_offload.c:262
> > > > features=features@...ry=19, is_ipv6=false)
> > > >     at out/android15-6.6/kernel-6.6/kernel-6.6/net/ipv4/udp_offload.c:289
> > > > features=19)
> > > >     at out/android15-6.6/kernel-6.6/kernel-6.6/net/ipv4/udp_offload.c:399
> > > > features=19)
> > > >     at out/android15-6.6/kernel-6.6/kernel-6.6/net/ipv4/af_inet.c:1418
> > > > skb@...ry=0x0, features=19, features@...ry=0)
> > > >     at out/android15-6.6/kernel-6.6/kernel-6.6/net/core/gso.c:53
> > > > tx_path=<optimized out>)
> > > >     at out/android15-6.6/kernel-6.6/kernel-6.6/net/core/gso.c:124
> > > 
> > > A full backtrace would help to better understand the issue.
> > 
> > Below is full backtrace:
> >  [ 1100.812205][    C3] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G        W  OE      6.6.17-android15-0-g380371ea9bf1 #1
> >  [ 1100.812211][    C3] Hardware name: MT6991(ENG) (DT)
> >  [ 1100.812215][    C3] Call trace:
> >  [ 1100.812218][    C3]  dump_backtrace+0xec/0x138
> >  [ 1100.812222][    C3]  show_stack+0x18/0x24
> >  [ 1100.812226][    C3]  dump_stack_lvl+0x50/0x6c
> >  [ 1100.812232][    C3]  dump_stack+0x18/0x24
> >  [ 1100.812237][    C3]  mrdump_common_die+0x24c/0x388 [mrdump]
> >  [ 1100.812259][    C3]  ipanic_die+0x20/0x34 [mrdump]
> >  [ 1100.812269][    C3]  notifier_call_chain+0x90/0x174
> >  [ 1100.812275][    C3]  notify_die+0x50/0x8c
> >  [ 1100.812279][    C3]  die+0x94/0x308
> >  [ 1100.812283][    C3]  __do_kernel_fault+0x240/0x26c
> >  [ 1100.812288][    C3]  do_page_fault+0xa0/0x48c
> >  [ 1100.812293][    C3]  do_translation_fault+0x38/0x54
> >  [ 1100.812297][    C3]  do_mem_abort+0x58/0x104
> >  [ 1100.812302][    C3]  el1_abort+0x3c/0x5c
> >  [ 1100.812307][    C3]  el1h_64_sync_handler+0x54/0x90
> >  [ 1100.812313][    C3]  el1h_64_sync+0x68/0x6c
> >  [ 1100.812317][    C3]  __udp_gso_segment+0x298/0x4d4
> >  [ 1100.812322][    C3]  udp4_ufo_fragment+0x130/0x174
> >  [ 1100.812326][    C3]  inet_gso_segment+0x164/0x330
> >  [ 1100.812330][    C3]  skb_mac_gso_segment+0xc4/0x13c
> >  [ 1100.812335][    C3]  __skb_gso_segment+0xc4/0x120
> >  [ 1100.812339][    C3]  udp_rcv_segment+0x50/0x134
> >  [ 1100.812344][    C3]  udp_queue_rcv_skb+0x74/0x114
> >  [ 1100.812348][    C3]  udp_unicast_rcv_skb+0x94/0xac
> >  [ 1100.812353][    C3]  __udp4_lib_rcv+0x3e0/0x818
> >  [ 1100.812358][    C3]  udp_rcv+0x20/0x30
> >  [ 1100.812362][    C3]  ip_protocol_deliver_rcu+0x194/0x368
> >  [ 1100.812368][    C3]  ip_local_deliver+0xe4/0x184
> >  [ 1100.812373][    C3]  ip_rcv+0x90/0x118
> >  [ 1100.812378][    C3]  __netif_receive_skb+0x74/0x124
> >  [ 1100.812383][    C3]  process_backlog+0xd8/0x18c
> >  [ 1100.812388][    C3]  __napi_poll+0x5c/0x1fc
> >  [ 1100.812392][    C3]  net_rx_action+0x150/0x334
> >  [ 1100.812397][    C3]  __do_softirq+0x120/0x3f4
> >  [ 1100.812401][    C3]  ____do_softirq+0x10/0x20
> >  [ 1100.812405][    C3]  call_on_irq_stack+0x3c/0x74
> >  [ 1100.812410][    C3]  do_softirq_own_stack+0x1c/0x2c
> >  [ 1100.812414][    C3]  __irq_exit_rcu+0x5c/0xd4
> >  [ 1100.812418][    C3]  irq_exit_rcu+0x10/0x1c
> >  [ 1100.812422][    C3]  el1_interrupt+0x38/0x58
> >  [ 1100.812428][    C3]  el1h_64_irq_handler+0x18/0x24
> >  [ 1100.812434][    C3]  el1h_64_irq+0x68/0x6c
> >  [ 1100.812437][    C3]  arch_local_irq_enable+0x4/0x8
> >  [ 1100.812443][    C3]  cpuidle_enter+0x38/0x54
> >  [ 1100.812449][    C3]  do_idle+0x198/0x294
> >  [ 1100.812454][    C3]  cpu_startup_entry+0x34/0x3c
> >  [ 1100.812459][    C3]  secondary_start_kernel+0x138/0x158
> >  [ 1100.812465][    C3]  __secondary_switched+0xc0/0xc4
> > 
> > > > This packet's frag list is null while gso_type is not 0. Then it is
> > > > treated as a GRO-ed packet and sent to segment frag list. Function
> > > > call path is
> > > > udp_rcv_segment => config features value
> > > >     __udpv4_gso_segment  => skb_gso_ok returns false. Here it should be
> > > >                             true.
> > > 
> > > Why? If I read the above correctly, this is a GSO packet landing in
> > > a UDP socket with no UDP_GRO sockopt. The packet is expected to be
> > > segmented again.
> > > 
> > Yes, it is a GSO packet; however, the fragment list of this GSO packet
> > becomes NULL. As the occurrence rate is very low, we really don't know
> > why and when it becomes NULL. It happens on both cellular and wlan
> > networks and seems to be an unknown kernel issue.
> >
> > To avoid the crash, segmentation should be skipped for the packet when
> > the fraglist is NULL.
> > 
> > > >                             Failed reason is features doesn't match
> > > >                             gso_type.
> > > >         __udp_gso_segment_list
> > > >             skb_segment_list => packet is linear with skb->next = NULL
> > > >             __udpv4_gso_segment_list_csum => use skb->next directly and
> > > >                                              crash happens
> > > > 
> > > > In rx-gro-list, a GRO-ed packet has its gso type set to
> > > > NETIF_F_GSO_UDP_L4 | NETIF_F_GSO_FRAGLIST in napi_gro_complete. In
> > > > the gso flow the features should also set them to match gso_type.
> > > > Otherwise skb_gso_ok will always return false. Then it can't
> > > > discover the untrusted source packet, which results in a crash in
> > > > the following function.
> > > 
> > > What is the 'untrusted source' here? I read the above as the packet
> > > aggregation happened in the GRO engine???
> > > 
> > > Could you please give a complete description of the relevant
> > > scenario?
> > > 
> > 
> > According to the backtrace info, we infer it is a rx-frag_list GRO
> 
> It would be helpful to see an skb_dump. But if this happens rarely in
> production, understood if that is not feasible.
> 
> The packet arrives on process_backlog, so still not sure how it is
> produced.
> 
Yes, it happens rarely. It is very hard to debug, and we are not sure of
the path that produces it.

> > packet. Before being delivered to the UDP socket with no UDP_GRO
> > sockopt, it seems to enter "skb_condense", which trims it and loses
> > its frag list. However, it still keeps gso_type and gso_size. Then it
> > continues on to skb_segment_list.
> > 
> > The first crash happens in skb_segment_list.
> > This patch resolves that crash and lets the packet become an skb
> > without skb->next:
> > https://lore.kernel.org/all/Y9gt5EUizK1UImEP@debian/
> > The crash then moves to __udp_gso_segment_list ->
> > skb_segment_list (finishes) -> __udpv4_gso_segment_list_csum, which
> > uses skb->next without a check and crashes.
> > 
> > 
> > What we want to do is to drop this abnormal packet.
> 
> I think we want to deliver this packet if possible.
> 
> Thanks for the added context. So this is assumed to be a GSO skb with
> SKB_GSO_FRAGLIST that somewhere lost its fraglist? That is the bug
> if true.
> 
> You are suggesting that this happens in the skb_condense in
> __udp_enqueue_schedule_skb?
> 
We tried adding a skb_condense call directly before skb_segment_list and
got a similar KE backtrace, with skb dump values that match the skb dump
from this issue.
However, we don't know which condition causes the flow to run into
skb_condense.
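
To make the suspected state concrete, here is a minimal userspace sketch
in C. It is not kernel code: struct model_skb, the MODEL_* flags,
model_condense and model_fraglist_gso_consistent are all hypothetical
stand-ins invented for illustration. It only models the assumption
discussed above, namely that a condense-like step can drop the frag list
while gso_type and gso_size stay untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the gso_type bits; not the kernel's values. */
#define MODEL_GSO_UDP_L4   (1u << 0)
#define MODEL_GSO_FRAGLIST (1u << 1)

/* Simplified model of the few skb fields that matter in this thread. */
struct model_skb {
	struct model_skb *frag_list; /* models skb_shinfo(skb)->frag_list */
	unsigned int gso_type;
	unsigned int gso_size;
	unsigned int len;
	unsigned int data_len;       /* bytes held outside the linear area */
};

/*
 * Assumed behaviour of a condense-like step: the payload is folded into
 * the linear area and the frag list is released, but the GSO metadata
 * is left as it was.
 */
static void model_condense(struct model_skb *skb)
{
	assert(skb != NULL);
	skb->frag_list = NULL;
	skb->data_len = 0;
}

/* A fraglist GSO packet is only consistent if it still has a frag list. */
static bool model_fraglist_gso_consistent(const struct model_skb *skb)
{
	assert(skb != NULL);
	if (!(skb->gso_type & MODEL_GSO_FRAGLIST))
		return true;
	return skb->frag_list != NULL;
}
```

In this model, a packet that passes the consistency check before
model_condense fails it afterwards, which is the same shape as the skb
dump we see: gso_type still set, frag list gone.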

> If generated by GRO then on a device that has NETIF_F_GRO_FRAGLIST
> set.
> So one workaround (not fix) is to disable that.
> 
As we previously hit another GRO issue in skb_segment (in the flow with
NETIF_F_GRO_FRAGLIST disabled), it is still not safe to disable GRO
fraglist.

We hope the current patch can be applied to drop the invalid packet.
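
The intent of dropping the packet can be sketched in userspace C as
below. This is not the submitted patch and not kernel code: struct
model_skb, enum model_verdict and model_segment_list_csum are
hypothetical names, and the port comparison is only a stand-in for the
header checks that read skb->next. The sketch assumes the caller has
already decided the packet is a fraglist GSO packet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes for this model only. */
enum model_verdict {
	MODEL_DROP = -1,
	MODEL_PORTS_SAME = 0,
	MODEL_PORTS_DIFFER = 1,
};

/* Simplified model of an skb; field names are illustrative. */
struct model_skb {
	struct model_skb *next;      /* models skb->next (segment chain) */
	struct model_skb *frag_list; /* models skb_shinfo(skb)->frag_list */
	unsigned short dport;
};

/*
 * Model of "segment the frag list, then compare the head with the next
 * segment". Without the guard, a head with no frag list would leave
 * head->next NULL and the comparison would dereference it.
 */
static enum model_verdict model_segment_list_csum(struct model_skb *head)
{
	assert(head != NULL);

	/* Guard: no frag list means there is nothing to chain, so drop. */
	if (head->frag_list == NULL)
		return MODEL_DROP;

	/* The frag list becomes the ->next chain of the head. */
	head->next = head->frag_list;
	head->frag_list = NULL;

	/* head->next is known to be non-NULL here. */
	return head->dport == head->next->dport ? MODEL_PORTS_SAME
						: MODEL_PORTS_DIFFER;
}
```

The guard is placed before any use of head->next, so the abnormal packet
is rejected instead of reaching the dereference.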

> > So we set features NETIF_F_GSO_UDP_L4 | NETIF_F_GSO_FRAGLIST to match
> > the condition from Fixes: f2696099c6c6, and then drop it.
