Message-ID: <878qodwlzw.ffs@tglx>
Date: Sun, 06 Apr 2025 13:46:43 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: I Hsin Cheng <richard120310@...il.com>,
 syzbot+d5e61dcfda08821a226d@...kaller.appspotmail.com
Cc: anna-maria@...utronix.de, frederic@...nel.org,
 linux-kernel@...r.kernel.org, syzkaller-bugs@...glegroups.com,
 linux-kernel-mentees@...ts.linux.dev, skhan@...uxfoundation.org, Alexander
 Potapenko <glider@...gle.com>, Marco Elver <elver@...gle.com>, Dmitry
 Vyukov <dvyukov@...gle.com> 
Subject: Re: [RFC PATCH RESEND] timerqueue: Complete rb_node initialization
 within timerqueue_init

On Sat, Apr 05 2025 at 16:05, I. Hsin Cheng wrote:
> The children of "node" within "struct timerqueue_node" may be uninit
> status after the initialization. Initialize them as NULL under
> timerqueue_init to prevent the problem.

Which problem?

It's completely sufficient to use RB_INIT_NODE() on initialization.

As you provided neither a link nor an explanation, I had to waste some
time searching through the syzbot site to look at the actual issue:

BUG: KMSAN: uninit-value in rb_next+0x200/0x210 lib/rbtree.c:505
 rb_next+0x200/0x210 lib/rbtree.c:505
 rb_erase_cached include/linux/rbtree.h:124 [inline]
 timerqueue_del+0xee/0x1a0 lib/timerqueue.c:57
 __remove_hrtimer kernel/time/hrtimer.c:1123 [inline]
 __run_hrtimer kernel/time/hrtimer.c:1771 [inline]
 __hrtimer_run_queues+0x3b7/0xe40 kernel/time/hrtimer.c:1855
 hrtimer_interrupt+0x41b/0xb10 kernel/time/hrtimer.c:1917
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1038 [inline]
 __sysvec_apic_timer_interrupt+0xa7/0x420 arch/x86/kernel/apic/apic.c:1055
 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline]
 sysvec_apic_timer_interrupt+0x7e/0x90 arch/x86/kernel/apic/apic.c:1049

So this code removes a queued timer from the RB tree and that KMSAN
warning happens in rb_next(), which is invoked from rb_erase_cached().

The issue happens in lib/rbtree.c:505

505:    while (node->rb_left)
506:          node = node->rb_left;

which walks down the left side of the tree. So that means it follows a
pointer which points to uninitialized memory.
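
For illustration, the successor walk can be sketched in userspace C roughly
as follows. This is a simplified model, not the code in lib/rbtree.c: the
"struct node" layout with an explicit parent pointer and the node_next()
helper are hypothetical names (the kernel packs the parent pointer and the
color into a single word). The inner while loop corresponds to the flagged
line: it dereferences every left pointer it follows, so one garbage pointer
sends the walk into arbitrary memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simplified node: explicit parent pointer, no color bit. */
struct node {
	struct node *parent;
	struct node *left;
	struct node *right;
};

/* In-order successor of @n, or NULL if @n is the last node. */
struct node *node_next(struct node *n)
{
	struct node *parent;

	assert(n != NULL);

	/* A right subtree exists: step right once, then go left as far as possible. */
	if (n->right) {
		n = n->right;
		while (n->left)		/* models the walk at lib/rbtree.c:505 */
			n = n->left;
		return n;
	}

	/* Otherwise climb until we arrive at a parent from its left child. */
	while ((parent = n->parent) && n == parent->right)
		n = parent;

	return parent;
}
```

Every pointer this walk follows must belong to a node which is still linked
and alive; the walk itself has no way to detect otherwise.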

All timers are queued with rb_add_cached(), which calls rb_link_node()
and that does:

    node->rb_left = node->rb_right = NULL;

Which means there can't be a timer enqueued in the RB tree which has
rb_left/right uninitialized.
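
A minimal userspace sketch of that guarantee, assuming a simplified node
layout: "struct node", link_node() and children_clear_after_link() are
hypothetical stand-ins modelled on rb_link_node(), not the kernel
definitions. The node is deliberately filled with a garbage pattern first
to show that linking overwrites both child pointers regardless of what the
memory contained before.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical simplified node: explicit parent pointer, no color bit. */
struct node {
	struct node *parent;
	struct node *left;
	struct node *right;
};

/* Modelled on rb_link_node(): attach @n below @parent via the slot @link. */
void link_node(struct node *n, struct node *parent, struct node **link)
{
	assert(n != NULL && link != NULL);

	n->parent = parent;
	n->left = n->right = NULL;	/* children are always reset here */
	*link = n;
}

/* Returns true if a garbage-filled node has NULL children once linked. */
bool children_clear_after_link(void)
{
	struct node *root = NULL;
	struct node n;

	memset(&n, 0xAA, sizeof(n));	/* simulate never-initialized memory */
	link_node(&n, NULL, &root);

	return root == &n && n.left == NULL && n.right == NULL;
}
```

In this model, whatever the init function did or did not do with the child
pointers is irrelevant once the node has been linked.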

So how does this end up at uninitialized memory? There are two
obvious explanations:

    1) A stray pointer corrupts the RB tree

    2) A queued timer has been freed

So how would this "initialization" help? Not at all.

We are not adding some random pointless initialization to paper
over a problem which is absolutely not understood.

Thanks,

        tglx


