[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <23f488d7-d369-46f2-8da6-c5fd2af0f9d7@nvidia.com>
Date: Fri, 9 May 2025 15:07:30 -0400
From: Joel Fernandes <joelagnelf@...dia.com>
To: Zqiang <qiang.zhang1211@...il.com>, paulmck@...nel.org,
frederic@...nel.org, neeraj.upadhyay@...nel.org, joel@...lfernandes.org,
urezki@...il.com, boqun.feng@...il.com
Cc: rcu@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] rcu/nocb: Fix possible invalid rdp's->nocb_cb_kthread
pointer access
On 5/7/2025 7:26 AM, Zqiang wrote:
> In the preparation stage of CPU online, if the corresponding
> the rdp's->nocb_cb_kthread does not exist, will be created,
> there is a situation where the rdp's rcuop kthreads creation fails,
> and then de-offload this CPU's rdp, does not assign this CPU's
> rdp->nocb_cb_kthread pointer, but this rdp's->nocb_gp_rdp and
> rdp's->rdp_gp->nocb_gp_kthread is still valid.
>
> This will cause the subsequent re-offload operation of this offline
> CPU, which will pass the conditional check and the kthread_unpark()
> will access invalid rdp's->nocb_cb_kthread pointer.
>
> This commit therefore use rdp's->nocb_gp_kthread instead of
> rdp_gp's->nocb_gp_kthread for safety check.
>
> Signed-off-by: Zqiang <qiang.zhang1211@...il.com>
Is it possible that rdp_gp->nocb_gp_kthread is valid, but rdp->nocb_cb_kthread
is still invalid (because its creation failed?). This seems a bit fragile to me
to assume that if rdp_gp->nocb_gp_kthread then rdp->nocb_cb_kthread is valid. Or
is there a path that makes sure this invariant is always satisfied? If so, can
we add additional warnings for checking this invariant?
Also from the other thread, it sounds like there is more work to do here
(related patches so I'd like to defer this to 6.17 - feel free to keep posting
patches for this work though).
Thanks!
- Joel
> ---
> kernel/rcu/tree_nocb.h | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> index 1596812f7f12..6679140bb0b5 100644
> --- a/kernel/rcu/tree_nocb.h
> +++ b/kernel/rcu/tree_nocb.h
> @@ -1146,7 +1146,6 @@ static bool rcu_nocb_rdp_offload_wait_cond(struct rcu_data *rdp)
> static int rcu_nocb_rdp_offload(struct rcu_data *rdp)
> {
> int wake_gp;
> - struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
>
> WARN_ON_ONCE(cpu_online(rdp->cpu));
> /*
> @@ -1156,7 +1155,7 @@ static int rcu_nocb_rdp_offload(struct rcu_data *rdp)
> if (!rdp->nocb_gp_rdp)
> return -EINVAL;
>
> - if (WARN_ON_ONCE(!rdp_gp->nocb_gp_kthread))
> + if (WARN_ON_ONCE(!rdp->nocb_gp_kthread))
> return -EINVAL;
>
> pr_info("Offloading %d\n", rdp->cpu);
> @@ -1166,7 +1165,7 @@ static int rcu_nocb_rdp_offload(struct rcu_data *rdp)
>
> wake_gp = rcu_nocb_queue_toggle_rdp(rdp);
> if (wake_gp)
> - wake_up_process(rdp_gp->nocb_gp_kthread);
> + wake_up_process(rdp->nocb_gp_kthread);
>
> swait_event_exclusive(rdp->nocb_state_wq,
> rcu_nocb_rdp_offload_wait_cond(rdp));
Powered by blists - more mailing lists