[<prev] [next>] [day] [month] [year] [list]
Message-Id: <20241220141913.1346301-1-cascardo@igalia.com>
Date: Fri, 20 Dec 2024 11:19:13 -0300
From: Thadeu Lima de Souza Cascardo <cascardo@...lia.com>
To: gfs2@...ts.linux.dev
Cc: Alexander Aring <aahringo@...hat.com>,
David Teigland <teigland@...hat.com>,
linux-kernel@...r.kernel.org,
kernel-dev@...lia.com,
Thadeu Lima de Souza Cascardo <cascardo@...lia.com>
Subject: [PATCH] dlm: prevent NPD when writing a positive value to event_done
do_uevent returns the value written to event_done. In case it is a positive
value, new_lockspace would undo all the work, and lockspace would not be
set. __dlm_new_lockspace, however, would treat that positive value as a
success due to commit 8511a2728ab8 ("dlm: fix use count with multiple
joins").
Down the line, device_create_lockspace would pass that NULL lockspace to
dlm_find_lockspace_local, leading to a NULL pointer dereference:
[ 1130.159339] BUG: kernel NULL pointer dereference, address: 0000000000000020
[ 1130.160015] #PF: supervisor write access in kernel mode
[ 1130.160242] #PF: error_code(0x0002) - not-present page
[ 1130.160242] PGD 0 P4D 0
[ 1130.160242] Oops: Oops: 0002 [#1] PREEMPT SMP NOPTI
[ 1130.160242] CPU: 4 UID: 0 PID: 213 Comm: dlm Not tainted 6.13.0-rc3-00077-gcbfcf8f40ff5 #318 0df1a232d58f3206f4f481ddfc0108c8b3da89be
[ 1130.160242] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 1130.160242] RIP: 0010:dlm_find_lockspace_local+0x8/0x20
[ 1130.160242] Code: 75 f3 f0 ff 43 20 eb 02 31 db 48 c7 c7 50 d6 85 b0 e8 4c aa 58 01 48 89 d8 5b 5d c3 cc cc cc cc 66 90 0f 1f 44 00 00 48 89 f8 <f0> ff 47 20 c3 cc cc cc cc 66 66 66 66 66 66 2e 0f 1f 84 00 00 00
[ 1130.160242] RSP: 0018:ffffa4bbc0637b08 EFLAGS: 00010246
[ 1130.160242] RAX: 0000000000000000 RBX: ffff8c8fc3cbab18 RCX: 0000000000000000
[ 1130.160242] RDX: 0000000000000006 RSI: ffffffffaeb012ce RDI: 0000000000000000
[ 1130.160242] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
[ 1130.160242] R10: 0000000000000000 R11: ffffffffac61e550 R12: ffff8c8fc34b5000
[ 1130.160242] R13: 000000000000007d R14: 0000000000000000 R15: ffff8c8fc3cbab00
[ 1130.160242] FS: 000078790ad92740(0000) GS:ffff8c903dd00000(0000) knlGS:0000000000000000
[ 1130.160242] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1130.160242] CR2: 0000000000000020 CR3: 000000000518a001 CR4: 0000000000770eb0
[ 1130.160242] PKRU: 55555554
[ 1130.160242] Call Trace:
[ 1130.160242] <TASK>
[ 1130.160242] ? __die_body+0x64/0xb0
[ 1130.160242] ? page_fault_oops+0x3eb/0x490
[ 1130.160242] ? exc_page_fault+0x4f2/0x6d0
[ 1130.160242] ? asm_exc_page_fault+0x22/0x30
[ 1130.160242] ? stack_trace_save+0x70/0x70
[ 1130.160242] ? dlm_find_lockspace_local+0x8/0x20
[ 1130.160242] device_create_lockspace+0x7c/0x180
[ 1130.160242] device_write+0x252/0x310
[ 1130.160242] vfs_write+0xe3/0x350
[ 1130.160242] ksys_write+0x74/0xe0
[ 1130.160242] do_syscall_64+0x87/0x100
[ 1130.160242] ? _raw_spin_unlock_irqrestore+0x3d/0x60
[ 1130.160242] ? __slab_free+0x2b6/0x320
[ 1130.160242] ? do_sys_openat2+0xae/0xe0
[ 1130.160242] ? kmem_cache_free+0x146/0x380
[ 1130.160242] ? do_syscall_64+0x93/0x100
[ 1130.160242] ? lockdep_hardirqs_on+0x95/0x140
[ 1130.160242] ? syscall_exit_to_user_mode+0x1f0/0x270
[ 1130.160242] ? do_syscall_64+0x93/0x100
[ 1130.160242] ? do_pte_missing+0xfd/0x1410
[ 1130.160242] ? do_pte_missing+0x1e7/0x1410
[ 1130.160242] ? handle_mm_fault+0x7ad/0xa30
[ 1130.160242] ? reacquire_held_locks+0x124/0x1c0
[ 1130.160242] ? vma_end_read+0x12/0xe0
[ 1130.160242] ? exc_page_fault+0x49e/0x6d0
[ 1130.160242] ? lockdep_hardirqs_on+0x95/0x140
[ 1130.160242] ? irqentry_exit_to_user_mode+0x1a7/0x1f0
[ 1130.160242] ? exc_page_fault+0x49e/0x6d0
[ 1130.160242] entry_SYSCALL_64_after_hwframe+0x55/0x5d
[ 1130.160242] RIP: 0033:0x78790aeb0214
[ 1130.160242] Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d 55 b1 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 48 89 54 24 18 48
[ 1130.160242] RSP: 002b:00007ffc0309b5e8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 1130.160242] RAX: ffffffffffffffda RBX: 00007ffc0309b7c8 RCX: 000078790aeb0214
[ 1130.160242] RDX: 000000000000007d RSI: 00007ffc0309b610 RDI: 0000000000000003
[ 1130.160242] RBP: 00007ffc0309b6b0 R08: 0000000000000000 R09: 000078790afd1210
[ 1130.160242] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000
[ 1130.160242] R13: 00007ffc0309b7d8 R14: 000056989ab4bd98 R15: 000078790b003000
[ 1130.160242] </TASK>
[ 1130.160242] Modules linked in: mousedev
[ 1130.160242] CR2: 0000000000000020
[ 1130.160242] ---[ end trace 0000000000000000 ]---
[ 1130.160242] RIP: 0010:dlm_find_lockspace_local+0x8/0x20
[ 1130.160242] Code: 75 f3 f0 ff 43 20 eb 02 31 db 48 c7 c7 50 d6 85 b0 e8 4c aa 58 01 48 89 d8 5b 5d c3 cc cc cc cc 66 90 0f 1f 44 00 00 48 89 f8 <f0> ff 47 20 c3 cc cc cc cc 66 66 66 66 66 66 2e 0f 1f 84 00 00 00
[ 1130.160242] RSP: 0018:ffffa4bbc0637b08 EFLAGS: 00010246
[ 1130.160242] RAX: 0000000000000000 RBX: ffff8c8fc3cbab18 RCX: 0000000000000000
[ 1130.160242] RDX: 0000000000000006 RSI: ffffffffaeb012ce RDI: 0000000000000000
[ 1130.160242] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
[ 1130.160242] R10: 0000000000000000 R11: ffffffffac61e550 R12: ffff8c8fc34b5000
[ 1130.160242] R13: 000000000000007d R14: 0000000000000000 R15: ffff8c8fc3cbab00
[ 1130.160242] FS: 000078790ad92740(0000) GS:ffff8c903dd00000(0000) knlGS:0000000000000000
[ 1130.160242] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1130.160242] CR2: 0000000000000020 CR3: 000000000518a001 CR4: 0000000000770eb0
[ 1130.160242] PKRU: 55555554
[ 1130.160242] Kernel panic - not syncing: Fatal exception
[ 1130.160242] Kernel Offset: 0x2b400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Treating such positive values as successes prevents the problem. Given this
has been broken for so long, this is unlikely to break userspace
expectations.
Fixes: 8511a2728ab8 ("dlm: fix use count with multiple joins")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@...lia.com>
---
fs/dlm/lockspace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c
index 8afac6e2dff0..1929327ffbe1 100644
--- a/fs/dlm/lockspace.c
+++ b/fs/dlm/lockspace.c
@@ -576,7 +576,7 @@ static int new_lockspace(const char *name, const char *cluster,
lockspace to start running (via sysfs) in dlm_ls_start(). */
error = do_uevent(ls, 1);
- if (error)
+ if (error < 0)
goto out_recoverd;
/* wait until recovery is successful or failed */
--
2.34.1
Powered by blists - more mailing lists