Message-ID: <20260127021711.1180952-1-realwujing@gmail.com>
Date: Mon, 26 Jan 2026 21:16:54 -0500
From: Qiliang Yuan <realwujing@...il.com>
To: dianders@...omium.org
Cc: akpm@...ux-foundation.org,
lihuafei1@...wei.com,
linux-kernel@...r.kernel.org,
mingo@...nel.org,
mm-commits@...r.kernel.org,
realwujing@...il.com,
song@...nel.org,
stable@...r.kernel.org,
sunshx@...natelecom.cn,
thorsten.blum@...ux.dev,
wangjinchao600@...il.com,
yangyicong@...ilicon.com,
yuanql9@...natelecom.cn,
zhangjn11@...natelecom.cn
Subject: Re: [PATCH v4] watchdog/hardlockup: Fix UAF in perf event cleanup due to migration race
Hi Doug,

Thanks for the insightful follow-up. Having the openEuler vs. mainline timing
differences clarified definitely explains why we hit this so reliably in our
downstream environment.
On Mon, Jan 26, 2026 at 5:14 PM Doug Anderson <dianders@...omium.org> wrote:
> OK, so I think the answer is: you haven't actually seen the problem
> (or the WARN_ON) on a mainline kernel, only on the openEuler 4.19
> kernel...
>
> ...actually, I looked and now think the problem doesn't exist on a
> mainline kernel. Specificaly, when we run lockup_detector_retry_init()
> we call schedule_work() to do the work. That schedules work on the
> "system_percpu_wq". While the work ends up being queued with
> "WORK_CPU_UNBOUND", I believe that we still end up running on a thread
> that's bound to just one CPU in the end. This is presumably why
> nobody has reported that "WARN_ON(!is_percpu_thread())" actually
> hitting on mainline.
You are right that in the latest mainline, schedule_work() has been updated
to use 'system_percpu_wq'. However, in many LTS kernels (including 4.19),
schedule_work() still submits to 'system_wq', which lacks the per-cpu
guarantee.
More importantly, even on 'system_percpu_wq', passing the check is not
automatic. On SMP, is_percpu_thread() returns true only when the task has
PF_NO_SETAFFINITY set and nr_cpus_allowed == 1, conditions that hold for
kthreads pinned via kthread_bind()/kthread_create_on_cpu() but are not
guaranteed for whichever worker picks up a WORK_CPU_UNBOUND item. Therefore
the WARN_ON(!is_percpu_thread()) in hardlockup_detector_event_create() can
still be violated in the retry path even on mainline.
The UAF risk stems from preemption being enabled during the probe: the logic
assumes the task cannot migrate (which is_percpu_thread() normally
guarantees), but a worker thread, even one on a per-cpu workqueue, does not
provide that guarantee by itself. By making the probe path stateless and
wrapping it in cpu_hotplug_disable(), we eliminate this dependency entirely.
> If that's the case, we'd definitely want to at least change the
> description and presumably _remove_ the Fixes tag? I actually still
> think the code looks nicer after your CL and (maybe?) we could even
> remove the whole schedule_work() for running this code? Maybe it was
> only added to deal with this exact problem? ...but the CL description
> would definitely need to be updated.
The schedule_work() in lockup_detector_retry_init() (added by 930d8f8dbab9)
is necessary for platforms where the PMU or other dependencies aren't ready
during early init.
I agree that the commit description should be updated to clarify that
while the issue was caught in a downstream kernel with shifted init timings,
it identifies a latent race condition in the mainline retry path.
Regarding the 'Fixes' tag, since 930d8f8dbab9 introduced the asynchronous
retry path which calls the probe logic from a non-percpu-thread context,
it still seems like the appropriate target for the "root cause" of the
vulnerability.
I'll refactor the commit message in V5 to better reflect this context
and remove the emphasis on ToT being "broken" out-of-the-box (since early
init is indeed safe there).
How does that sound to you?
Best regards,
Qiliang