Message-ID: <Zl4jPImmEeRuYQjz@slm.duckdns.org>
Date: Mon, 3 Jun 2024 10:10:36 -1000
From: Tejun Heo <tj@...nel.org>
To: Leon Romanovsky <leon@...nel.org>
Cc: Lai Jiangshan <jiangshanlai@...il.com>,
Zqiang <qiang.zhang1211@...il.com>, linux-kernel@...r.kernel.org,
Gal Pressman <gal@...dia.com>, Tariq Toukan <tariqt@...dia.com>,
RDMA mailing list <linux-rdma@...r.kernel.org>
Subject: Re: [PATCH -rc] workqueue: Reimplement UAF fix to avoid lockdep
worning
Hello, again, Leon.
Re-reading the warning, I'm not sure this is a bug on the workqueue side.
On Fri, May 31, 2024 at 06:48:51AM +0300, Leon Romanovsky wrote:
> [ 1233.554381] ==================================================================
> [ 1233.555215] BUG: KASAN: slab-use-after-free in lockdep_register_key+0x707/0x810
> [ 1233.555983] Read of size 8 at addr ffff88811f1d8928 by task test-ovs-bond-m/10149
> [ 1233.556774]
> [ 1233.557020] CPU: 0 PID: 10149 Comm: test-ovs-bond-m Not tainted 6.10.0-rc1_external_1613e604df0c #1
> [ 1233.557951] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> [ 1233.559044] Call Trace:
> [ 1233.559367] <TASK>
> [ 1233.559653] dump_stack_lvl+0x7e/0xc0
> [ 1233.560078] print_report+0xc1/0x600
> [ 1233.561975] kasan_report+0xb9/0xf0
> [ 1233.562872] lockdep_register_key+0x707/0x810
> [ 1233.564799] alloc_workqueue+0x466/0x1800
> [ 1233.567627] mlx5_pagealloc_init+0x7d/0x180 [mlx5_core]
> [ 1233.568322] mlx5_mdev_init+0x482/0xad0 [mlx5_core]
> [ 1233.569387] probe_one+0x11d/0xc80 [mlx5_core]
So, this is saying that alloc_workqueue() allocated a name during lockdep
initialization. This is before pwq init or anything else complicated
happens. It just allocated the workqueue struct and called into
lockdep_register_key(&wq->key).
> [ 1233.599979] Allocated by task 9589:
> [ 1233.600382] kasan_save_stack+0x20/0x40
> [ 1233.600828] kasan_save_track+0x10/0x30
> [ 1233.601265] __kasan_kmalloc+0x77/0x90
> [ 1233.601696] kernfs_iop_get_link+0x61/0x5a0
> [ 1233.602181] vfs_readlink+0x1ab/0x320
> [ 1233.602605] do_readlinkat+0x1cb/0x290
> [ 1233.602610] __x64_sys_readlinkat+0x92/0xf0
> [ 1233.602612] do_syscall_64+0x6d/0x140
> [ 1233.605196] entry_SYSCALL_64_after_hwframe+0x4b/0x53
> [ 1233.605731]
> [ 1233.605986] Freed by task 9589:
> [ 1233.606373] kasan_save_stack+0x20/0x40
> [ 1233.606801] kasan_save_track+0x10/0x30
> [ 1233.607232] kasan_save_free_info+0x37/0x50
> [ 1233.607695] poison_slab_object+0x10c/0x190
> [ 1233.608161] __kasan_slab_free+0x11/0x30
> [ 1233.608604] kfree+0x11b/0x340
> [ 1233.608970] vfs_readlink+0x120/0x320
> [ 1233.609413] do_readlinkat+0x1cb/0x290
> [ 1233.609849] __x64_sys_readlinkat+0x92/0xf0
> [ 1233.610308] do_syscall_64+0x6d/0x140
> [ 1233.610741] entry_SYSCALL_64_after_hwframe+0x4b/0x53
And KASAN is reporting use-after-free on a completely unrelated VFS object.
I can't tell for sure from the logs alone but lockdep_register_key()
iterates entries in the hashtable trying to find whether the key is a
duplicate and it could be that that walk is triggering the use-after-free
warning. If so, it doesn't really have much to do with workqueue. The
corruption happened elsewhere and workqueue just happens to traverse the
hashtable afterwards.
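To illustrate the kind of walk being described, here is a minimal userspace sketch of a hash table that checks for duplicates on registration. It is not lockdep's actual code: `sketch_key`, `sketch_table`, `sketch_hash()`, `sketch_register_key()` and `sketch_unregister_key()` are all hypothetical names, and the assumption is only that each key embeds its own list node, so the walk reads memory owned by every other entry in the same bucket.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_HASH_SIZE 16u

/* Hypothetical stand-in for a lock key: the list node lives inside the
 * object that owns the key, not in memory owned by the table. */
struct sketch_key {
	struct sketch_key *next;
};

static struct sketch_key *sketch_table[SKETCH_HASH_SIZE];

/* Hash on the key's address (an assumption made for this sketch). */
static unsigned int sketch_hash(const struct sketch_key *key)
{
	return (unsigned int)(((uintptr_t)key >> 4) & (SKETCH_HASH_SIZE - 1));
}

/* Register a key. Returns false if it is already in the table. */
static bool sketch_register_key(struct sketch_key *key)
{
	struct sketch_key **head, *k;

	assert(key != NULL);
	head = &sketch_table[sketch_hash(key)];

	/* Duplicate check: this dereferences every entry already in the
	 * bucket. If the owner of one of them was freed without being
	 * unregistered, the read of k->next lands in freed memory. */
	for (k = *head; k; k = k->next)
		if (k == key)
			return false;

	key->next = *head;
	*head = key;
	return true;
}

/* Unlink a key so that later walks no longer touch its memory. */
static void sketch_unregister_key(struct sketch_key *key)
{
	struct sketch_key **pp = &sketch_table[sketch_hash(key)];

	for (; *pp; pp = &(*pp)->next) {
		if (*pp == key) {
			*pp = key->next;
			key->next = NULL;
			return;
		}
	}
}
```

In this sketch the caller doing the registration is never the one that left a stale entry behind, which is why the report would point at whoever walks the bucket next rather than at the code that freed the memory.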
Thanks.
--
tejun