[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <11e9e09a-c47f-4690-a7c5-9a08913c3e5d@paulmck-laptop>
Date: Fri, 21 Jul 2023 15:37:05 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: thunder.leizhen@...weicloud.com
Cc: Frederic Weisbecker <frederic@...nel.org>,
Neeraj Upadhyay <quic_neeraju@...cinc.com>,
Joel Fernandes <joel@...lfernandes.org>,
Josh Triplett <josh@...htriplett.org>,
Boqun Feng <boqun.feng@...il.com>,
Steven Rostedt <rostedt@...dmis.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Lai Jiangshan <jiangshanlai@...il.com>,
Zqiang <qiang.zhang1211@...il.com>, rcu@...r.kernel.org,
linux-kernel@...r.kernel.org, Zhen Lei <thunder.leizhen@...wei.com>
Subject: Re: [PATCH 1/2] rcu: Delete a redundant check in check_cpu_stall()
On Fri, Jul 21, 2023 at 03:57:15PM +0800, thunder.leizhen@...weicloud.com wrote:
> From: Zhen Lei <thunder.leizhen@...wei.com>
>
> j = jiffies;
> js = READ_ONCE(rcu_state.jiffies_stall); (1)
> if (ULONG_CMP_LT(j, js)) (2)
> return;
>
> if (cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) (3)
> didstall = true;
>
> if (didstall && READ_ONCE(rcu_state.jiffies_stall) == jn) { (4)
> jn = jiffies + 3 * rcu_jiffies_till_stall_check() + 3;
> WRITE_ONCE(rcu_state.jiffies_stall, jn);
> }
>
> For ease of description, the pseudo code is extracted as above. First,
> assume that only one CPU is operating, the condition 'didstall' is true
> means that (3) succeeds. That is, the value of rcu_state.jiffies_stall
> must be 'jn'.
>
> Then, assuming that another CPU is also operating at the same time, there
> are two cases:
> 1. That CPU sees the updated result at (1), it returns at (2).
> 2. That CPU does not see the updated result at (1), it fails at (3) and
> cmpxchg returns jn. So that, didstall cannot be true.
The history behind this one is that there are races in which the stall
can end in the midst of check_cpu_stall(). For example, when the activity
of producing the stall warning breaks things loose.
And yes, long ago, I figured that if things had been static for so
many seconds, they were unlikely to change, and thus omitted any and
all synchronization. The Linux kernel taught me better. ;-)
Thanx, Paul
> Signed-off-by: Zhen Lei <thunder.leizhen@...wei.com>
> ---
> kernel/rcu/tree_stall.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> index cc884cd49e026a3..371713f3f7d15d9 100644
> --- a/kernel/rcu/tree_stall.h
> +++ b/kernel/rcu/tree_stall.h
> @@ -794,7 +794,7 @@ static void check_cpu_stall(struct rcu_data *rdp)
> rcu_ftrace_dump(DUMP_ALL);
> didstall = true;
> }
> - if (didstall && READ_ONCE(rcu_state.jiffies_stall) == jn) {
> + if (didstall) {
> jn = jiffies + 3 * rcu_jiffies_till_stall_check() + 3;
> WRITE_ONCE(rcu_state.jiffies_stall, jn);
> }
> --
> 2.25.1
>
Powered by blists - more mailing lists