Message-ID: <20260128023757.1693269-1-realwujing@gmail.com>
Date: Tue, 27 Jan 2026 21:37:52 -0500
From: Qiliang Yuan <realwujing@...il.com>
To: dianders@...omium.org
Cc: akpm@...ux-foundation.org,
lihuafei1@...wei.com,
linux-kernel@...r.kernel.org,
mingo@...nel.org,
mm-commits@...r.kernel.org,
realwujing@...il.com,
song@...nel.org,
stable@...r.kernel.org,
sunshx@...natelecom.cn,
thorsten.blum@...ux.dev,
wangjinchao600@...il.com,
yangyicong@...ilicon.com,
yuanql9@...natelecom.cn,
zhangjn11@...natelecom.cn
Subject: Re: [PATCH v4] watchdog/hardlockup: Fix UAF in perf event cleanup due to migration race
Hi Doug,
Thanks for your detailed feedback and for the patient explanation regarding the
mainline workqueue behavior.
On Tue, 27 Jan 2026 13:37:28 Doug Anderson <dianders@...omium.org> wrote:
> Really, it matters what schedule_work() does on anyone who happens to have
> commit 930d8f8dbab9 ("watchdog/perf: adapt the watchdog_perf interface
> for async model")... we have to focus on supporting the mainline kernel here.
I completely agree that the focus must be on the mainline kernel. I've since
checked and confirmed that in mainline, schedule_work() is redirected to
system_percpu_wq (via include/linux/workqueue.h), which provides the
necessary CPU affinity.
> To ask directly: have you seen this WARN_ON in mainline, or is this
> all speculative?
To be direct: no, I haven't seen this WARN_ON on a pure mainline kernel. As
you suspected, the issue was identified in a downstream 4.19-based kernel
with different initialization timings and workqueue behavior. My assumption
that it would also affect mainline was indeed speculative and based on an
incomplete understanding of include/linux/sched.h's is_percpu_thread()
implementation on modern kernels.
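
For reference, the check in question can be modeled in userspace roughly as
below. This is only a sketch under stated assumptions: struct task_model and
is_percpu_thread_model() are hypothetical names, and the flag value and the
two-condition test are assumed to mirror the SMP case of is_percpu_thread().

```c
#include <stdbool.h>

/*
 * Hypothetical userspace stand-in for the two task_struct fields the
 * check reads. This is not the kernel's task_struct.
 */
struct task_model {
	unsigned int flags;
	int nr_cpus_allowed;
};

/* Assumed to match the kernel's flag value; used here for illustration. */
#define PF_NO_SETAFFINITY 0x04000000u

/*
 * Model of the SMP case: a task counts as a per-CPU thread only if its
 * affinity cannot be changed AND it is allowed on exactly one CPU.
 */
static bool is_percpu_thread_model(const struct task_model *t)
{
	return (t->flags & PF_NO_SETAFFINITY) && t->nr_cpus_allowed == 1;
}
```

A task that merely happens to be restricted to one CPU, without
PF_NO_SETAFFINITY set, does not pass this check, since it can still be
migrated later.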
> I'm still not convinced that there was ever a UAF in mainline nor that
> this actually "Fixes" anything in mainline. I do agree that the code
> is better by not having it write the per-cpu variable at probe time
Since the issue does not currently manifest in mainline, I have reframed
the patch as a "cleanup and robustness improvement", as you suggested. This
removes the fragile implicit dependency on the caller's context and makes
the probe stateless.
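
The idea of a stateless probe can be sketched in userspace as follows. This
is not the patch itself: event_model, event_create(), event_release() and
probe_stateless() are hypothetical names that only illustrate the pattern of
creating a temporary event, checking the result, and releasing it without
ever storing it in shared or per-cpu state.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a perf event; not a kernel type. */
struct event_model {
	int cpu;
};

/*
 * Hypothetical creation helper. Returns NULL when the (simulated)
 * hardware counter is unavailable or allocation fails.
 */
static struct event_model *event_create(int cpu, bool hw_available)
{
	struct event_model *ev;

	if (!hw_available)
		return NULL;

	ev = malloc(sizeof(*ev));
	if (ev)
		ev->cpu = cpu;
	return ev;
}

static void event_release(struct event_model *ev)
{
	free(ev);
}

/*
 * Stateless probe: the temporary event lives only in a local variable,
 * so the result does not depend on which CPU the caller runs on or on
 * whether the caller migrates while probing.
 */
static bool probe_stateless(int cpu, bool hw_available)
{
	struct event_model *ev = event_create(cpu, hw_available);

	if (!ev)
		return false;

	event_release(ev);
	return true;
}
```

Because nothing outlives the call, there is no stale pointer left behind for
a later cleanup path to dereference.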
I have sent v6 with these changes. Please ignore v5 and review v6 instead.
v6 changes:
- Renamed the title to "simplify perf event probe and remove per-cpu dependency".
- Removed the "Fixes:" tag and "Cc: stable".
- Rewrote the commit message in the imperative mood.
- Updated the description to clarify that it addresses code brittleness
rather than a confirmed mainline bug.
v6 link: https://lore.kernel.org/all/20260127025814.1200345-1-realwujing@gmail.com/
Best regards,
Qiliang