Message-Id: <20260106070723.2313045-1-wangqing7171@gmail.com>
Date: Tue, 6 Jan 2026 15:07:22 +0800
From: Qing Wang <wangqing7171@...il.com>
To: akpm@...ux-foundation.org
Cc: Liam.Howlett@...cle.com,
brauner@...nel.org,
bsegall@...gle.com,
david@...nel.org,
dietmar.eggemann@....com,
jack@...e.cz,
joel.granados@...nel.org,
juri.lelli@...hat.com,
keescook@...mium.org,
linux-kernel@...r.kernel.org,
lorenzo.stoakes@...cle.com,
mingo@...hat.com,
mjguzik@...il.com,
oleg@...hat.com,
peterz@...radead.org,
rostedt@...dmis.org,
rppt@...nel.org,
syzbot+e0378d4f4fe57aa2bdd0@...kaller.appspotmail.com,
vbabka@...e.cz,
vincent.guittot@...aro.org,
wangqing7171@...il.com
Subject: Re: [PATCH] fork/pid: Fix use-after-free in __task_pid_nr_ns
> It might be helpful to have a comment here telling readers how
> task->signal can be zero.
>
> Also, what in here prevents task->signal from being zeroed after we've
> tested it and before we dereference it?
Thank you for your feedback. Regarding the "test-and-use" race condition
you raised, I’ve thought about it extensively but haven’t found a
better solution on the access side.
However, after re-examining the issue, I think the root cause lies in
the copy_process() flow itself, so we may not need complex handling at
the access site:
1. The signal_struct is not fully managed by reference counting: In
the normal (successful) path of copy_process(), the signal structure is
indeed reference-counted, and its lifetime should be at least as long as
the task’s. However, in the failure/cleanup path, signal is explicitly
freed via free_signal_struct(), which prematurely ends its lifetime. At
the same time, other subsystems (e.g., perf) might still hold references
to the task and try to access task->signal, even if such access is
itself questionable.
2. A newly created task should not be visible to other CPUs during
creation: The perf subsystem copies the parent’s events
to the child during copy_process(). Later, when the parent closes or
manipulates its own perf event, it may traverse child events and access
child_ctx->task->signal. This means that a child process that has not
yet been fully created can be referenced by other CPUs.
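To make point 1 concrete, here is a minimal single-threaded userspace
model of the lifetime mismatch. It is only a sketch, not kernel code: the
model_* names are hypothetical, the counters are plain ints rather than
refcount_t, and the "observer" merely stands in for perf holding a task
reference. It reports whether the signal object was already gone while
the observer could still reach the task.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of this is kernel code. */
struct model_signal {
	int refs;
};

struct model_task {
	int refs;
	struct model_signal *signal;
	int signal_released;	/* set once the signal object has been freed */
};

static void model_put_signal(struct model_task *t)
{
	if (t->signal && --t->signal->refs == 0) {
		free(t->signal);
		t->signal = NULL;
		t->signal_released = 1;
	}
}

/* The last task reference drops the signal reference, in the spirit of
 * __put_task_struct(). */
static void model_put_task(struct model_task *t)
{
	assert(t->refs > 0);
	if (--t->refs == 0) {
		model_put_signal(t);
		free(t);
	}
}

/*
 * Simulate a failed fork while an observer still holds a task reference.
 * Returns 1 if the signal object was already released while the observer
 * could still reach the task, 0 if it was still valid, -1 on allocation
 * failure.
 */
static int model_failed_fork(int explicit_free)
{
	struct model_task *t = calloc(1, sizeof(*t));
	int gone_while_observed;

	if (!t)
		return -1;
	t->signal = calloc(1, sizeof(*t->signal));
	if (!t->signal) {
		free(t);
		return -1;
	}
	t->signal->refs = 1;
	t->refs = 2;		/* one for the creator, one for the observer */

	if (explicit_free) {	/* cleanup path frees the signal directly */
		free(t->signal);
		t->signal = NULL;
		t->signal_released = 1;
	}
	model_put_task(t);	/* creator drops its task reference */

	gone_while_observed = t->signal_released;	/* observer looks */
	model_put_task(t);	/* observer drops the last reference */
	return gone_while_observed;
}
```

In this model, model_failed_fork(1) reports that the signal was released
while the task was still reachable, whereas model_failed_fork(0) keeps it
valid until the last task reference is dropped.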
Based on this analysis, I propose two possible fixes; either one should
resolve the issue:
1. Remove the explicit free_signal_struct() call in the cleanup path and
let reference counting fully manage the signal lifetime. Currently
put_signal_struct() is only used in __put_task_struct(), so the signal
would then live at least as long as the task.
2. Defer perf_event_init_task() until after copy_signal() succeeds, so
that if copy_process() fails, the perf events are cleaned up before the
signal is freed. This guarantees that no perf event can access a freed
signal.
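The ordering argument behind option 2 can be sketched with a small
userspace model of goto-style unwinding. Again, this is only an
illustration under simplifying assumptions: model_copy_process_fail() and
its parameters are hypothetical, and only two setup steps ("perf" and
"signal") are modelled. Cleanup runs in reverse order of setup, so
whichever step is set up last is torn down first.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical model, not kernel code. Sets up "perf" and "signal" in the
 * chosen order, then fails at fail_step:
 *   0 = the first setup step fails (nothing to undo)
 *   1 = the second setup step fails (undo the first)
 *   2 = a later step fails (undo both, in reverse order)
 * The teardown order is written to log, which must hold at least 16 bytes.
 */
static void model_copy_process_fail(int perf_after_signal, int fail_step,
				    char *log)
{
	const char *first = perf_after_signal ? "signal" : "perf";
	const char *second = perf_after_signal ? "perf" : "signal";

	log[0] = '\0';
	if (fail_step == 0)
		goto out;
	if (fail_step == 1)
		goto cleanup_first;

	/* A later step failed: undo the second setup step first. */
	strcat(log, second);
	strcat(log, " ");
cleanup_first:
	strcat(log, first);
out:
	return;
}
```

With perf set up first, a late failure tears down "signal" before "perf";
with perf deferred until after the signal setup, the order becomes "perf"
then "signal", which is the property option 2 relies on.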
I believe either approach would eliminate the issue. Could you please
review whether this analysis and the proposed solutions are correct? Any
guidance would be greatly appreciated.