linux-kernel - Re: [PATCH -rc] workqueue: Reimplement UAF fix to avoid lockdep warning

Message-ID: <Zl4jPImmEeRuYQjz@slm.duckdns.org>
Date: Mon, 3 Jun 2024 10:10:36 -1000
From: Tejun Heo <tj@...nel.org>
To: Leon Romanovsky <leon@...nel.org>
Cc: Lai Jiangshan <jiangshanlai@...il.com>,
	Zqiang <qiang.zhang1211@...il.com>, linux-kernel@...r.kernel.org,
	Gal Pressman <gal@...dia.com>, Tariq Toukan <tariqt@...dia.com>,
	RDMA mailing list <linux-rdma@...r.kernel.org>
Subject: Re: [PATCH -rc] workqueue: Reimplement UAF fix to avoid lockdep
 warning

Hello, again, Leon.

Re-reading the warning, I'm not sure this is a bug on the workqueue side.

On Fri, May 31, 2024 at 06:48:51AM +0300, Leon Romanovsky wrote:
>  [ 1233.554381] ==================================================================
>  [ 1233.555215] BUG: KASAN: slab-use-after-free in lockdep_register_key+0x707/0x810
>  [ 1233.555983] Read of size 8 at addr ffff88811f1d8928 by task test-ovs-bond-m/10149
>  [ 1233.556774] 
>  [ 1233.557020] CPU: 0 PID: 10149 Comm: test-ovs-bond-m Not tainted 6.10.0-rc1_external_1613e604df0c #1
>  [ 1233.557951] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
>  [ 1233.559044] Call Trace:
>  [ 1233.559367]  <TASK>
>  [ 1233.559653]  dump_stack_lvl+0x7e/0xc0
>  [ 1233.560078]  print_report+0xc1/0x600
>  [ 1233.561975]  kasan_report+0xb9/0xf0
>  [ 1233.562872]  lockdep_register_key+0x707/0x810
>  [ 1233.564799]  alloc_workqueue+0x466/0x1800
>  [ 1233.567627]  mlx5_pagealloc_init+0x7d/0x180 [mlx5_core]
>  [ 1233.568322]  mlx5_mdev_init+0x482/0xad0 [mlx5_core]
>  [ 1233.569387]  probe_one+0x11d/0xc80 [mlx5_core]

So, this is saying that the use-after-free read happened while
alloc_workqueue() was doing its lockdep initialization. This is before pwq
init or anything else complicated happens. It had just allocated the
workqueue struct and called into lockdep_register_key(&wq->key).

>  [ 1233.599979] Allocated by task 9589:
>  [ 1233.600382]  kasan_save_stack+0x20/0x40
>  [ 1233.600828]  kasan_save_track+0x10/0x30
>  [ 1233.601265]  __kasan_kmalloc+0x77/0x90
>  [ 1233.601696]  kernfs_iop_get_link+0x61/0x5a0
>  [ 1233.602181]  vfs_readlink+0x1ab/0x320
>  [ 1233.602605]  do_readlinkat+0x1cb/0x290
>  [ 1233.602610]  __x64_sys_readlinkat+0x92/0xf0
>  [ 1233.602612]  do_syscall_64+0x6d/0x140
>  [ 1233.605196]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
>  [ 1233.605731] 
>  [ 1233.605986] Freed by task 9589:
>  [ 1233.606373]  kasan_save_stack+0x20/0x40
>  [ 1233.606801]  kasan_save_track+0x10/0x30
>  [ 1233.607232]  kasan_save_free_info+0x37/0x50
>  [ 1233.607695]  poison_slab_object+0x10c/0x190
>  [ 1233.608161]  __kasan_slab_free+0x11/0x30
>  [ 1233.608604]  kfree+0x11b/0x340
>  [ 1233.608970]  vfs_readlink+0x120/0x320
>  [ 1233.609413]  do_readlinkat+0x1cb/0x290
>  [ 1233.609849]  __x64_sys_readlinkat+0x92/0xf0
>  [ 1233.610308]  do_syscall_64+0x6d/0x140
>  [ 1233.610741]  entry_SYSCALL_64_after_hwframe+0x4b/0x53

And KASAN is reporting use-after-free on a completely unrelated VFS object.
I can't tell for sure from the logs alone but lockdep_register_key()
iterates entries in the hashtable trying to find whether the key is a
duplicate and it could be that that walk is triggering the use-after-free
warning. If so, it doesn't really have much to do with workqueue. The
corruption happened elsewhere and workqueue just happens to traverse the
hashtable afterwards.
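
To illustrate the suspected pattern, here is a minimal userspace sketch. It is
not the lockdep code: struct key_entry, struct key_table, register_key() and
demo_stale_hits() are hypothetical names made up for this example, and the
"freed" flag only simulates memory that was released while still linked into
a hash chain. The point is just that the duplicate check walks every entry in
the bucket, so an innocent later caller is the one that touches the stale
entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NBUCKETS 4 /* toy size, chosen only for the demo */

/* Hypothetical stand-in for a registered key; not the kernel's layout. */
struct key_entry {
	struct key_entry *next; /* chain link inside one hash bucket */
	unsigned long id;       /* value the toy hash is computed from */
	bool freed;             /* simulates "backing memory was freed" */
};

struct key_table {
	struct key_entry *buckets[NBUCKETS];
};

/*
 * Register a key: walk its bucket looking for a duplicate, then link it in.
 * Returns 0 on success, -1 if the key is already present.
 * *stale_hits counts chain entries that were "freed" but never unlinked;
 * in a real kernel, touching such an entry is where KASAN would complain.
 */
static int register_key(struct key_table *t, struct key_entry *k,
			int *stale_hits)
{
	struct key_entry **head;
	struct key_entry *e;

	assert(t && k && stale_hits);
	head = &t->buckets[k->id % NBUCKETS];

	for (e = *head; e; e = e->next) {
		if (e->freed)
			(*stale_hits)++; /* the walk touches stale memory */
		if (e == k)
			return -1;       /* duplicate registration */
	}

	k->next = *head;
	*head = k;
	return 0;
}

/* Two unrelated users whose keys happen to land in the same bucket. */
int demo_stale_hits(void)
{
	struct key_table table = { { NULL } };
	struct key_entry a = { .next = NULL, .id = 1, .freed = false };
	struct key_entry b = { .next = NULL, .id = 5, .freed = false };
	int stale = 0;

	register_key(&table, &a, &stale); /* first user registers its key */
	a.freed = true;                   /* its memory goes away, still linked */
	register_key(&table, &b, &stale); /* second user walks over the stale entry */

	return stale;
}
```

In this sketch demo_stale_hits() returns 1: the second registration is the one
that trips over the stale entry, even though the actual mistake is in whoever
let the first key go away without unlinking it.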

Thanks.

-- 
tejun