Message-ID: <20241224092938.GC171473@unreal>
Date: Tue, 24 Dec 2024 11:29:38 +0200
From: Leon Romanovsky <leon@...nel.org>
To: Lin Ma <linma@....edu.cn>
Cc: jgg@...pe.ca, cmeiohas@...dia.com, michaelgur@...dia.com,
	linux-rdma@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [bug report] RDMA/iwpm: reentrant iwpm hello message

On Fri, Dec 20, 2024 at 11:32:34PM +0800, Lin Ma wrote:
> Hello maintainers,
> 
> Our fuzzer identified one interesting reentrant bug that could cause hang
> in the kernel. The crash log is like below:
> 
> [   32.616575][ T2983]
> [   32.617000][ T2983] ============================================
> [   32.617879][ T2983] WARNING: possible recursive locking detected
> [   32.618759][ T2983] 6.1.70 #1 Not tainted
> [   32.619362][ T2983] --------------------------------------------
> [   32.620248][ T2983] hello.elf/2983 is trying to acquire lock:
> [   32.621084][ T2983] ffffffff91978ff8 (&rdma_nl_types[idx].sem){.+.+}-{3:3}, at: rdma_nl_rcv+0x30f/0x990
> [   32.624234][ T2983]
> [   32.624234][ T2983] but task is already holding lock:
> [   32.625237][ T2983] ffffffff91978ff8 (&rdma_nl_types[idx].sem){.+.+}-{3:3}, at: rdma_nl_rcv+0x30f/0x990
> [   32.626562][ T2983]
> [   32.626562][ T2983] other info that might help us debug this:
> [   32.627648][ T2983]  Possible unsafe locking scenario:
> [   32.627648][ T2983]
> [   32.633708][ T2983]        CPU0
> [   32.634184][ T2983]        ----
> [   32.634646][ T2983]   lock(&rdma_nl_types[idx].sem);
> [   32.635433][ T2983]   lock(&rdma_nl_types[idx].sem);
> [   32.636155][ T2983]
> [   32.636155][ T2983]  *** DEADLOCK ***
> [   32.636155][ T2983]
> [   32.637236][ T2983]  May be due to missing lock nesting notation
> [   32.637236][ T2983]
> [   32.638408][ T2983] 2 locks held by hello.elf/2983:
> [   32.639135][ T2983]  #0: ffffffff91978ff8 (&rdma_nl_types[idx].sem){.+.+}-{3:3}, at: rdma_nl_rcv+0x30f/0x990
> [   32.640605][ T2983]  #1: ffff888103f8f690 (nlk_cb_mutex-RDMA){+.+.}-{3:3}, at: netlink_dump+0xd3/0xc60
> [   32.641981][ T2983]
> [   32.641981][ T2983] stack backtrace:
> [   32.642833][ T2983] CPU: 0 PID: 2983 Comm: hello.elf Not tainted 6.1.70 #1
> [   32.643830][ T2983] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> [   32.645243][ T2983] Call Trace:
> [   32.645735][ T2983]  <TASK>
> [   32.646197][ T2983]  dump_stack_lvl+0x177/0x231
> [   32.646901][ T2983]  ? nf_tcp_handle_invalid+0x605/0x605
> [   32.647705][ T2983]  ? panic+0x725/0x725
> [   32.648350][ T2983]  validate_chain+0x4dd0/0x6010
> [   32.649080][ T2983]  ? reacquire_held_locks+0x5a0/0x5a0
> [   32.649864][ T2983]  ? mark_lock+0x94/0x320
> [   32.650506][ T2983]  ? lockdep_hardirqs_on_prepare+0x3fd/0x760
> [   32.651376][ T2983]  ? print_irqtrace_events+0x210/0x210
> [   32.652182][ T2983]  ? mark_lock+0x94/0x320
> [   32.652825][ T2983]  __lock_acquire+0x12ad/0x2010
> [   32.653541][ T2983]  lock_acquire+0x1b4/0x490
> [   32.654211][ T2983]  ? rdma_nl_rcv+0x30f/0x990
> [   32.654891][ T2983]  ? __might_sleep+0xd0/0xd0
> [   32.655569][ T2983]  ? __lock_acquire+0x12ad/0x2010
> [   32.656316][ T2983]  ? read_lock_is_recursive+0x10/0x10
> [   32.657109][ T2983]  down_read+0x42/0x2d0
> [   32.657723][ T2983]  ? rdma_nl_rcv+0x30f/0x990
> [   32.658400][ T2983]  rdma_nl_rcv+0x30f/0x990
> [   32.659132][ T2983]  ? rdma_nl_net_init+0x160/0x160
> [   32.659847][ T2983]  ? netlink_lookup+0x30/0x200
> [   32.660519][ T2983]  ? __netlink_lookup+0x2a/0x6d0
> [   32.661214][ T2983]  ? netlink_lookup+0x30/0x200
> [   32.661880][ T2983]  ? netlink_lookup+0x30/0x200
> [   32.662545][ T2983]  netlink_unicast+0x74b/0x8c0
> [   32.663215][ T2983]  rdma_nl_unicast+0x4b/0x60
> [   32.663852][ T2983]  iwpm_send_hello+0x1d8/0x350
> [   32.664525][ T2983]  ? iwpm_mapinfo_available+0x130/0x130
> [   32.665295][ T2983]  ? iwpm_parse_nlmsg+0x124/0x260
> [   32.665995][ T2983]  iwpm_hello_cb+0x1e1/0x2e0
> [   32.666638][ T2983]  ? netlink_dump+0x236/0xc60
> [   32.667294][ T2983]  ? iwpm_mapping_error_cb+0x3e0/0x3e0
> [   32.668064][ T2983]  netlink_dump+0x592/0xc60
> [   32.668706][ T2983]  ? netlink_lookup+0x200/0x200
> [   32.669381][ T2983]  ? __netlink_lookup+0x2a/0x6d0
> [   32.670073][ T2983]  ? netlink_lookup+0x30/0x200
> [   32.670731][ T2983]  ? netlink_lookup+0x30/0x200
> [   32.671411][ T2983]  __netlink_dump_start+0x54e/0x710
> [   32.672220][ T2983]  rdma_nl_rcv+0x753/0x990
> [   32.672846][ T2983]  ? rdma_nl_net_init+0x160/0x160
> [   32.673538][ T2983]  ? iwpm_mapping_error_cb+0x3e0/0x3e0
> [   32.674316][ T2983]  ? netlink_deliver_tap+0x2e/0x1b0
> [   32.675106][ T2983]  ? net_generic+0x1e/0x240
> [   32.675778][ T2983]  ? netlink_deliver_tap+0x2e/0x1b0
> [   32.676553][ T2983]  netlink_unicast+0x74b/0x8c0
> [   32.677262][ T2983]  netlink_sendmsg+0x882/0xb90
> [   32.677969][ T2983]  ? netlink_getsockopt+0x550/0x550
> [   32.678732][ T2983]  ? aa_sock_msg_perm+0x94/0x150
> [   32.679465][ T2983]  ? bpf_lsm_socket_sendmsg+0x5/0x10
> [   32.680243][ T2983]  ? security_socket_sendmsg+0x7c/0xa0
> [   32.681048][ T2983]  __sys_sendto+0x456/0x5b0
> [   32.681724][ T2983]  ? __ia32_sys_getpeername+0x80/0x80
> [   32.682510][ T2983]  ? __lock_acquire+0x2010/0x2010
> [   32.683241][ T2983]  ? lockdep_hardirqs_on_prepare+0x3fd/0x760
> [   32.684134][ T2983]  ? fd_install+0x5c/0x4f0
> [   32.684794][ T2983]  ? print_irqtrace_events+0x210/0x210
> [   32.685608][ T2983]  __x64_sys_sendto+0xda/0xf0
> [   32.686298][ T2983]  do_syscall_64+0x45/0x90
> [   32.686955][ T2983]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [   32.687810][ T2983] RIP: 0033:0x440624
> [   32.691944][ T2983] RSP: 002b:00007ffc82f3ee48 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
> [   32.693136][ T2983] RAX: ffffffffffffffda RBX: 0000000000400400 RCX: 0000000000440624
> [   32.694264][ T2983] RDX: 0000000000000018 RSI: 00007ffc82f3ee80 RDI: 0000000000000003
> [   32.695387][ T2983] RBP: 00007ffc82f3fe90 R08: 000000000047df08 R09: 000000000000000c
> [   32.696621][ T2983] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000403990
> [   32.697693][ T2983] R13: 0000000000000000 R14: 00000000006a6018 R15: 0000000000000000
> [   32.698774][ T2983]  </TASK>
> 
> In a nutshell, the callback for the RDMA_NL_IWPM_HELLO command, iwpm_hello_cb,
> can itself call rdma_nl_unicast(), which re-enters rdma_nl_rcv() and tries to
> acquire the same rdma_nl_types[idx].sem again, so the task can deadlock and
> hang the kernel.
> 
> I am not familiar with the internal workings of the callback mechanism or how
> IWPMD utilizes it, so I'm uncertain whether this reentrancy is expected behavior.
> If it is, perhaps a reference counter should be used instead of an rw_semaphore.
> If not, a proper check should be implemented.

I don't fully understand the lockdep report here. We use down_read(), which is
safe to re-enter.
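
For reference, the recursion in the trace above is, schematically (a
simplified sketch reconstructed from the backtrace, not verbatim kernel
code; function names are taken from the trace):

    rdma_nl_rcv()                 /* down_read(&rdma_nl_types[idx].sem), lock #0 */
      __netlink_dump_start()
        netlink_dump()            /* mutex_lock(nlk_cb_mutex-RDMA), lock #1 */
          iwpm_hello_cb()
            iwpm_send_hello()
              rdma_nl_unicast()
                netlink_unicast() /* delivered in-line to the rdma netlink socket */
                  rdma_nl_rcv()   /* down_read() on the same rwsem again */

Note that down_read() only nests safely while no writer is queued: if a
down_write() arrives between the two read acquisitions, the inner
down_read() blocks behind the waiting writer, and the writer in turn
waits for the outer reader, so lockdep reports the recursive read even
when no writer is present at the time.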

Thanks

> 
> Regards,
> Lin
