Message-ID: <CAD=FV=Vmk1jA+dAgJNVDMtxrhhrPxgnXkNxiqJXWBvgUcZZUxQ@mail.gmail.com>
Date: Sat, 24 Jan 2026 15:36:01 -0800
From: Doug Anderson <dianders@...omium.org>
To: Qiliang Yuan <realwujing@...il.com>
Cc: akpm@...ux-foundation.org, lihuafei1@...wei.com, 
	linux-kernel@...r.kernel.org, mingo@...nel.org, mm-commits@...r.kernel.org, 
	song@...nel.org, stable@...r.kernel.org, sunshx@...natelecom.cn, 
	thorsten.blum@...ux.dev, wangjinchao600@...il.com, yangyicong@...ilicon.com, 
	yuanql9@...natelecom.cn, zhangjn11@...natelecom.cn
Subject: Re: [PATCH v3] watchdog/hardlockup: Fix UAF in perf event cleanup due
 to migration race

Hi,

On Fri, Jan 23, 2026 at 10:57 PM Qiliang Yuan <realwujing@...il.com> wrote:
>
> Thanks for the detailed review!
>
> > Wait a second... The above function hasn't existed for 2.5 years. It
> > was removed in commit d9b3629ade8e ("watchdog/hardlockup: have the
> > perf hardlockup use __weak functions more cleanly"). All that's left
> > in the ToT kernel referencing that function is an old comment...
> >
> > Oh, and I guess I can see below that your stack traces are on 4.19,
> > which is ancient! Things have changed a bit in the meantime. Are you
> > certain that the problem still reproduces on ToT?
>
> The function hardlockup_detector_perf_init() was renamed to
> watchdog_hardlockup_probe() in commit d9b3629ade8e ("watchdog/hardlockup:
> have the perf hardlockup use __weak functions more cleanly").
> Additionally, the source file was moved from kernel/watchdog_hld.c to
> kernel/watchdog_perf.c in commit 6ea0d04211a7. The v3 commit message
> inadvertently retained legacy terminology from the 4.19 kernel; this will
> be updated in V4 to reflect current ToT naming.
>
> The core logic remains the same: the race condition persists despite the
> renaming and cleanup of the __weak function logic.
>
> Regarding ToT reproducibility: while the KASAN report originated from
> 4.19, the underlying logic is still problematic in ToT. In
> watchdog_hardlockup_probe(), the call to
> hardlockup_detector_event_create() still writes to the per-cpu
> watchdog_ev. Task migration between event creation and the subsequent
> perf_event_release_kernel() leaves a stale pointer in the watchdog_ev of
> the original CPU.
>
> > Probably want a "Fixes" tag? If I had to guess, maybe?
> >
> > Fixes: 930d8f8dbab9 ("watchdog/perf: adapt the watchdog_perf interface
> > for async model")
>
> Commit 930d8f8dbab9 introduced the async initialization which allows
> preemption/migration during the probe phase. This tag will be included in
> V4.

The part that doesn't make a lot of sense to me, though, is that v4.19
also doesn't have commit 930d8f8dbab9 ("watchdog/perf: adapt the
watchdog_perf interface for async model"), which is where we are
saying the problem was introduced.

...so in v4.19 I think:
* hardlockup_detector_perf_init() is only called from watchdog_nmi_probe()
* watchdog_nmi_probe() is only called from lockup_detector_init()
* lockup_detector_init() is only called from kernel_init_freeable(),
  right before smp_init()

Thus I'm super confused about how you could have seen the problem on
v4.19. Maybe your v4.19 kernel has some backported patches that make
this possible?

I'm not saying that the v4 patch you just posted is incorrect; I'm
just trying to make sure that:

1. We actually understand the problem you were seeing.

2. We are identifying the correct "Fixes" commit.


> > I'm still a bit confused why this warning didn't trigger previously.
> > Do you know why?
>
> In 4.19, hardlockup_detector_event_create() did not include the
> WARN_ON(!is_percpu_thread()) check, which was added in later versions. In
> ToT, this warning is expected to trigger if watchdog_hardlockup_probe()
> is called from a non-per-cpu-bound thread (such as kernel_init). This
> further justifies refactoring the creation logic to be CPU-agnostic for
> probing.

OK, fair enough. ...but I'm a bit curious why nobody else saw this
WARN_ON(). I'm also curious if you have tested the hardlockup detector
on newer kernels, or if all of your work has been done on 4.19. If all
your work has been done on 4.19, do we need to find someone to test
your patch on a newer kernel and make sure it works OK? If you've
tested on a newer kernel, did the hardlockup detector init from the
kernel's early-init code, or the retry code?

-Doug
