Message-ID: <01b4d228-9416-43f8-a62e-124b92e8741a@paulmck-laptop>
Date: Sun, 10 Mar 2024 12:43:40 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: "Joel Fernandes (Google)" <joel@...lfernandes.org>
Cc: linux-kernel@...r.kernel.org, frederic@...nel.org, boqun.feng@...il.com,
	urezki@...il.com, neeraj.iitr10@...il.com, rcu@...r.kernel.org,
	rostedt@...dmis.org, Neeraj Upadhyay <neeraj.upadhyay@...nel.org>,
	Josh Triplett <josh@...htriplett.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	Lai Jiangshan <jiangshanlai@...il.com>,
	Zqiang <qiang.zhang1211@...il.com>
Subject: Re: [PATCH v2 rcu/dev 2/2] rcu/tree: Add comments explaining
 now-offline-CPU QS reports

On Fri, Mar 08, 2024 at 05:44:38PM -0500, Joel Fernandes (Google) wrote:
> This is a confusing piece of code (rightfully so, as the issue it deals
> with is complex). Recent discussions brought up a question -- what
> prevents rcu_implicit_dyntick_qs() from warning about QS reports for
> offline CPUs?
> 
> QS reporting for now-offline CPUs should only happen from:
> - gp_init()
> - rcutree_report_cpu_dead()
> 
> Add comments to this code explaining how QS reporting is not missed
> when these functions run concurrently.
> 
> Signed-off-by: Joel Fernandes (Google) <joel@...lfernandes.org>

Thank you for putting this together!

A couple of questions below.

							Thanx, Paul

> ---
>  kernel/rcu/tree.c | 36 +++++++++++++++++++++++++++++++++++-
>  1 file changed, 35 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index bd29fe3c76bf..f3582f843a05 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1917,7 +1917,22 @@ static noinline_for_stack bool rcu_gp_init(void)

Would it make sense to tag the earlier arch_spin_lock(&rcu_state.ofl_lock)
as preventing a grace period from starting concurrently with
rcutree_report_cpu_dead()?
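
For illustration, such a tag might look roughly like the following at the
ofl_lock acquisition in rcu_gp_init().  The surrounding code and the
comment wording here are only a sketch, not the actual tree:

	/* Exclude CPU hotplug operations. */
	rcu_for_each_leaf_node(rnp) {
		local_irq_save(flags);
		/*
		 * Hold off rcutree_report_cpu_dead():  While ofl_lock
		 * is held, that function cannot update this leaf's
		 * ->qsmaskinitnext, so grace-period initialization
		 * sees a stable view of which CPUs are offline.
		 */
		arch_spin_lock(&rcu_state.ofl_lock);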

>  		trace_rcu_grace_period_init(rcu_state.name, rnp->gp_seq,
>  					    rnp->level, rnp->grplo,
>  					    rnp->grphi, rnp->qsmask);
> -		/* Quiescent states for tasks on any now-offline CPUs. */
> +		/*
> +		 * === Quiescent states for tasks on any now-offline CPUs. ===
> +		 *
> +		 * QS reporting for now-offline CPUs should only be performed from
> +		 * either here, i.e., gp_init(), or from rcutree_report_cpu_dead().
> +		 *
> +		 * Note that, when reporting quiescent states for now-offline CPUs,
> +		 * the sequence of code doing those reports while also accessing
> +		 * ->qsmask and ->qsmaskinitnext has to be atomic, so that QS
> +		 * reporting is not missed! Otherwise it is possible that
> +		 * rcu_implicit_dyntick_qs() screams. This is ensured by holding
> +		 * rnp->lock throughout these QS-reporting sequences. That lock
> +		 * is also acquired in rcutree_report_cpu_dead(), so acquiring
> +		 * ofl_lock is not necessary here to synchronize with that
> +		 * function.
> +		 */

Would it be better to put the long-form description in the "Hotplug
CPU" section of Documentation/RCU/Design/Requirements/Requirements.rst?
I will be the first to admit that this section is not as detailed as it
needs to be.  This section is already referenced by the block comment
preceding the WARN_ON_ONCE() in rcu_implicit_dyntick_qs(), which is
where people will look first if any of this gets messed up.

Then these other places can refer to that comment or to that section of
Requirements.rst, allowing them to focus on the corresponding piece of
the puzzle.
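
If so, the comment here could then shrink to a pointer, along these
lines (wording is only a sketch):

	/*
	 * Quiescent states for tasks on any now-offline CPUs.  Such
	 * QSes are reported only here and in rcutree_report_cpu_dead(),
	 * in each case with rnp->lock held across the report.  See the
	 * "Hotplug CPU" section of Requirements.rst for the full story.
	 */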

>  		mask = rnp->qsmask & ~rnp->qsmaskinitnext;
>  		rnp->rcu_gp_init_mask = mask;
>  		if ((mask || rnp->wait_blkd_tasks) && rcu_is_leaf_node(rnp))
> @@ -5116,6 +5131,25 @@ void rcutree_report_cpu_dead(void)
>  	raw_spin_lock_irqsave_rcu_node(rnp, flags); /* Enforce GP memory-order guarantee. */
>  	rdp->rcu_ofl_gp_seq = READ_ONCE(rcu_state.gp_seq);
>  	rdp->rcu_ofl_gp_state = READ_ONCE(rcu_state.gp_state);
> +
> +	/*
> +	 * === Quiescent state reporting for now-offline CPUs ===
> +	 *
> +	 * QS reporting for now-offline CPUs should only be performed from
> +	 * either here, i.e., rcutree_report_cpu_dead(), or gp_init().
> +	 *
> +	 * Note that, when reporting quiescent states for now-offline CPUs,
> +	 * the sequence of code doing those reports while also accessing
> +	 * ->qsmask and ->qsmaskinitnext has to be atomic, so that QS
> +	 * reporting is not missed! Otherwise it is possible that
> +	 * rcu_implicit_dyntick_qs() screams. This is ensured by holding
> +	 * rnp->lock throughout these QS-reporting sequences. That lock is
> +	 * also acquired in gp_init().
> +	 * One slight exception to this rule is below, where we release and
> +	 * reacquire the lock after a QS report, but before we clear the
> +	 * ->qsmaskinitnext bit. That is OK, because gp_init() will report
> +	 * the QS again if it acquires rnp->lock before we reacquire it below.
> +	 */

And then this need only say what is happening right here, but possibly
moved to within the following "if" statement, at which point we know that
we are in a grace period that cannot end until we report the quiescent
state (which releases the rcu_node structure's ->lock) and a new grace
period cannot look at this rcu_node structure's ->qsmaskinitnext until
we release rcu_state.ofl_lock.
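
Something like this, perhaps (again, only a sketch of the wording):

	if (rnp->qsmask & mask) { /* RCU waiting on outgoing CPU? */
		/*
		 * This grace period cannot end until we report this
		 * CPU's quiescent state, which releases rnp->lock, and
		 * a new grace period cannot sample this rcu_node's
		 * ->qsmaskinitnext until we release rcu_state.ofl_lock.
		 */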

Thoughts?

>  	if (rnp->qsmask & mask) { /* RCU waiting on outgoing CPU? */
>  		/* Report quiescent state -before- changing ->qsmaskinitnext! */
>  		rcu_disable_urgency_upon_qs(rdp);
> -- 
> 2.34.1
> 
