Message-ID: <878qodwlzw.ffs@tglx>
Date: Sun, 06 Apr 2025 13:46:43 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: I Hsin Cheng <richard120310@...il.com>,
syzbot+d5e61dcfda08821a226d@...kaller.appspotmail.com
Cc: anna-maria@...utronix.de, frederic@...nel.org,
linux-kernel@...r.kernel.org, syzkaller-bugs@...glegroups.com,
linux-kernel-mentees@...ts.linux.dev, skhan@...uxfoundation.org, Alexander
Potapenko <glider@...gle.com>, Marco Elver <elver@...gle.com>, Dmitry
Vyukov <dvyukov@...gle.com>
Subject: Re: [RFC PATCH RESEND] timerqueue: Complete rb_node initialization
within timerqueue_init
On Sat, Apr 05 2025 at 16:05, I. Hsin Cheng wrote:
> The children of "node" within "struct timerqueue_node" may be uninit
> status after the initialization. Initialize them as NULL under
> timerqueue_init to prevent the problem.
Which problem?
It's completely sufficient to use RB_INIT_NODE() on initialization.
As you provided neither a link nor an explanation, I had to waste some
time searching through the syzbot site to look at the actual issue:
BUG: KMSAN: uninit-value in rb_next+0x200/0x210 lib/rbtree.c:505
rb_next+0x200/0x210 lib/rbtree.c:505
rb_erase_cached include/linux/rbtree.h:124 [inline]
timerqueue_del+0xee/0x1a0 lib/timerqueue.c:57
__remove_hrtimer kernel/time/hrtimer.c:1123 [inline]
__run_hrtimer kernel/time/hrtimer.c:1771 [inline]
__hrtimer_run_queues+0x3b7/0xe40 kernel/time/hrtimer.c:1855
hrtimer_interrupt+0x41b/0xb10 kernel/time/hrtimer.c:1917
local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1038 [inline]
__sysvec_apic_timer_interrupt+0xa7/0x420 arch/x86/kernel/apic/apic.c:1055
instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline]
sysvec_apic_timer_interrupt+0x7e/0x90 arch/x86/kernel/apic/apic.c:1049
So this code removes a queued timer from the RB tree and that KMSAN
warning happens in rb_next(), which is invoked from rb_erase_cached().
The issue happens in lib/rbtree.c:505
505: while (node->rb_left)
506: node = node->rb_left;
which walks down the left side of the tree. So that means it hits a
pointer which points to uninitialized memory.
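To make the left descent concrete, here is a minimal user-space sketch of
that loop. It is not the kernel code: struct model_node, model_leftmost()
and model_next_in_right_subtree() are hypothetical names, and only the
"node has a right child" case of the successor lookup is modelled.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct rb_node; only the child links are modelled. */
struct model_node {
	struct model_node *left;
	struct model_node *right;
};

/*
 * Descend to the leftmost node of the subtree rooted at @node. Every
 * node on the path is dereferenced, so each ->left must be either NULL
 * or a pointer to a live, initialized node.
 */
static struct model_node *model_leftmost(struct model_node *node)
{
	while (node->left)
		node = node->left;
	return node;
}

/* Successor lookup, restricted to the case where @node has a right child. */
static struct model_node *model_next_in_right_subtree(struct model_node *node)
{
	if (!node->right)
		return NULL;	/* the parent-walking case is not modelled here */
	return model_leftmost(node->right);
}
```

If any node reached on that path lives in freed or otherwise stale memory,
the loop reads its ->left and that read is what a tool like KMSAN reports.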
All timers are queued with rb_add_cached(), which calls rb_link_node()
and that does:
node->rb_left = node->rb_right = NULL;
Which means there can't be a timer enqueued in the RB tree which has
rb_left/right uninitialized.
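The same point as a small user-space sketch. This is a simplified model
and not the kernel implementation: struct sketch_node, sketch_link_node()
and sketch_children_cleared_after_link() are hypothetical names, and the
0xAA fill merely simulates memory that was never initialized.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct rb_node. */
struct sketch_node {
	uintptr_t parent_color;
	struct sketch_node *right;
	struct sketch_node *left;
};

/* Models the linking step: record the parent, clear both children, hook in. */
static void sketch_link_node(struct sketch_node *node,
			     struct sketch_node *parent,
			     struct sketch_node **link)
{
	node->parent_color = (uintptr_t)parent;
	node->left = node->right = NULL;
	*link = node;
}

/*
 * Returns 1 if a node whose memory held garbage before linking ends up
 * with both child pointers NULL once it has been linked.
 */
static int sketch_children_cleared_after_link(void)
{
	struct sketch_node root = { 0, NULL, NULL };
	struct sketch_node n;

	memset(&n, 0xAA, sizeof(n));	/* simulate never-initialized memory */
	sketch_link_node(&n, &root, &root.left);

	return root.left == &n && n.left == NULL && n.right == NULL;
}
```

Whatever the node contained before enqueueing, the link step overwrites
both child pointers, so an extra NULL assignment at init time changes
nothing for a node that is actually in the tree.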
So how does this end up at uninitialized memory? There are two
obvious explanations:
1) A stray pointer corrupts the RB tree
2) A queued timer has been freed
So what would this "initialization" fix? Nothing at all.
We are not adding some random pointless initialization to paper
over a problem which is absolutely not understood.
Thanks,
tglx