linux-kernel - Re: [RFC v2 4/5] rcu: Use for_each_leaf_node_cpu() in force_qs

Message-ID: <20161222010849.GA1728@tardis.cn.ibm.com>
Date:   Thu, 22 Dec 2016 09:08:49 +0800
From:   Boqun Feng <boqun.feng@...il.com>
To:     "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc:     Colin Ian King <colin.king@...onical.com>,
        Mark Rutland <mark.rutland@....com>,
        linux-kernel@...r.kernel.org,
        Josh Triplett <josh@...htriplett.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Lai Jiangshan <jiangshanlai@...il.com>
Subject: Re: [RFC v2 4/5] rcu: Use for_each_leaf_node_cpu() in force_qs_rnp()

On Wed, Dec 21, 2016 at 08:48:45AM -0800, Paul E. McKenney wrote:
> On Wed, Dec 21, 2016 at 12:18:08PM +0800, Boqun Feng wrote:
> > On Tue, Dec 20, 2016 at 07:40:24PM -0800, Paul E. McKenney wrote:
> > [...]
> > > > 
> > > > Agreed, my intent is to keep this overcautious check for a couple of
> > > > releases. If no one shoots themselves in the foot, we can remove it; if
> > > > someone does, it definitely means this part is subtle and we need to pay
> > > > more attention to it, maybe by writing some regression tests for this
> > > > particular problem to help developers avoid it.
> > > > 
> > > > This check is supposed to be removed, so I'm not set on keeping it.
> > > 
> > > I suggest keeping through validation.  If it triggers during that time,
> > > consider keeping it longer.  If it does not trigger, remove it before
> > > it goes upstream.
> > 
> > Good point ;-)
> > 
> > [...]
> > > > > > 
> > > > > > But this brings up a side question: is the callsite of rcu_cpu_starting()
> > > > > > correct, given that rcu_cpu_starting() ignores the @cpu parameter and only
> > > > > > sets _this_ CPU's bit in a leaf node?
> > > > > 
> > > > > The calls from notify_cpu_starting() are called from the various
> > > > > start_kernel_secondary(), secondary_start_kernel(), and similarly
> > > > > named functions.  These are called on the incoming CPU early in that
> > > > > CPU's execution.  The call from rcu_init() is correct until such time
> > > > > as more than one CPU can be running at rcu_init() time.  And that
> > > > > day might be coming, so please see the untested patch below.
> > > > 
> > > > Looks better than mine ;-)
> > > > 
> > > > But do we need to worry that we start rcu on each CPU twice, which may
> > > > slow down the boot?
> > > 
> > > We only start a given CPU once.  The boot CPU at rcu_init() time, and
> > > the rest at CPU-hotplug time.  Unless of course a CPU is later taken
> > 
> > Confused... we call rcu_cpu_starting() in a for_each_online_cpu() loop
> > in rcu_init(), so we basically start all online CPUs there after
> > applying your patch. And all the remaining CPUs will get themselves started
> > again at CPU-hotplug time, right?
> 
> At rcu_init() time, there is only one online CPU, namely the boot CPU.
> 
> Or perhaps your point is that if CPUs come online before rcu_init(), they
> might do so via the normal online mechanism.  I don't believe that this
> is likely, because the normal online mechanism requires the scheduler
> be running.  But either way, my hope would be that whoever fires up CPUs
> before rcu_init() asks a few questions when they run into bugs.  ;-)
> 

;-)

> > Besides, without your patch, we started the boot CPU many times in the
> > for_each_online_cpu() loop.
> 
> That is true.  It is harmless because it just does a group of assignments
> repeatedly, and because there is only one CPU and because interrupts
> are disabled, this cannot have any effect.  And my fix inadvertently
> fixed this issue, didn't it?
> 

Yep!

> So I do need to update the commit log accordingly.  Done!
> 
> > Am I missing something subtle?
> 
> Given the nature of RCU, the only possible answer I can give to that
> question is "probably".  (Hey, you asked!!!)
> 

True, I misread the for_each_online_cpu() loop in rcu_init(): I thought
that at that time, CPUs other than the boot CPU had already marked themselves
in cpu_online_mask. But that's not true.

Sorry for the noise, and thank you for the explanation ;-)
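
For the archive, a minimal userspace sketch of the point settled above. It
is not kernel code: every name ending in _sketch is hypothetical, and plain
bitmasks stand in for cpu_online_mask and for the leaf node's per-CPU bits.
It assumes, as discussed in this thread, that only the boot CPU is online
when the rcu_init() loop runs, and that starting a CPU amounts to setting
its bit, so that repeating the call changes nothing.

```c
#include <assert.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count */

/* Stand-in for cpu_online_mask: bit n set means CPU n is online. */
static unsigned long online_mask_sketch;

/* Stand-in for the leaf node's record of CPUs that RCU has started. */
static unsigned long leaf_started_mask_sketch;

/* Mark a CPU online, as the boot path or CPU hotplug would. */
void set_cpu_online_sketch(unsigned int cpu)
{
	assert(cpu < NR_CPUS_SKETCH);
	online_mask_sketch |= 1UL << cpu;
}

/* Model of rcu_cpu_starting(cpu): set that CPU's bit in its leaf node. */
void rcu_cpu_starting_sketch(unsigned int cpu)
{
	assert(cpu < NR_CPUS_SKETCH);
	leaf_started_mask_sketch |= 1UL << cpu;
}

/* Model of the loop in rcu_init(); returns how many CPUs it started. */
unsigned int rcu_init_sketch(void)
{
	unsigned int cpu, started = 0;

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		if (!(online_mask_sketch & (1UL << cpu)))
			continue;	/* an online-CPU loop skips offline CPUs */
		rcu_cpu_starting_sketch(cpu);
		started++;
	}
	return started;
}
```

With only CPU 0 marked online, rcu_init_sketch() starts exactly one CPU.
A CPU that comes online later is started once by its own call, and a
repeated call for the same CPU leaves the mask unchanged.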

Regards,
Boqun

> 							Thanx, Paul
> 
> > Regards,
> > Boqun
> > 
> > > offline, in which case we start it again when it comes back online.
> > > 
> > > 							Thanx, Paul
> > > 
> > > > Regards,
> > > > Boqun
> > > > 
> > > > > 							Thanx, Paul
> > > > > 
> > > > > ------------------------------------------------------------------------
> > > > > 
> > > > > commit 1e84402587173d6d4da8645689f0e24c877b3269
> > > > > Author: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
> > > > > Date:   Tue Dec 20 07:17:58 2016 -0800
> > > > > 
> > > > >     rcu: Make rcu_cpu_starting() use its "cpu" argument
> > > > >     
> > > > >     The rcu_cpu_starting() function uses this_cpu_ptr() to locate the
> > > > >     incoming CPU's rcu_data structure.  This works for the boot CPU and for
> > > > >     all CPUs onlined after rcu_init() executes (during very early boot).
> > > > >     Currently, this is the full set of CPUs, so all is well.  But if
> > > > >     anyone ever parallelizes boot before rcu_init() time, it will fail.
> > > > > >     This commit therefore replaces the rcu_cpu_starting() function's
> > > > > >     this_cpu_ptr() with per_cpu_ptr(), future-proofing the code and
> > > > > >     (arguably) improving readability.
> > > > >     
> > > > >     Reported-by: Boqun Feng <boqun.feng@...il.com>
> > > > >     Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
> > > > > 
> > > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > > index b9d3c0e30935..083cb8a6299c 100644
> > > > > --- a/kernel/rcu/tree.c
> > > > > +++ b/kernel/rcu/tree.c
> > > > > @@ -4017,7 +4017,7 @@ void rcu_cpu_starting(unsigned int cpu)
> > > > >  	struct rcu_state *rsp;
> > > > >  
> > > > >  	for_each_rcu_flavor(rsp) {
> > > > > -		rdp = this_cpu_ptr(rsp->rda);
> > > > > +		rdp = per_cpu_ptr(rsp->rda, cpu);
> > > > >  		rnp = rdp->mynode;
> > > > >  		mask = rdp->grpmask;
> > > > >  		raw_spin_lock_irqsave_rcu_node(rnp, flags);
> > > > > 
> > > 
> > > 
> 
> 
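
The one-line change in the quoted patch can be illustrated with a small
userspace sketch. This is only a model under stated assumptions: an array
stands in for the per-CPU rcu_data, a global stands in for "the CPU this
code is running on", and all names ending in _sketch are hypothetical
rather than kernel APIs. It shows why ignoring the argument is fine when a
CPU starts itself, but touches the wrong entry when one CPU starts another.

```c
#include <assert.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count */

struct rcu_data_sketch {
	int started;		/* nonzero once RCU has been started here */
};

/* Stand-in for the per-CPU rcu_data instances. */
static struct rcu_data_sketch rda_sketch[NR_CPUS_SKETCH];

/* Stand-in for the CPU the caller is currently running on. */
static unsigned int current_cpu_sketch;

/* Pre-patch model: ignore "cpu" and touch the running CPU's data. */
void starting_this_cpu_sketch(unsigned int cpu)
{
	(void)cpu;
	rda_sketch[current_cpu_sketch].started = 1;
}

/* Post-patch model: touch the data of the CPU named by the argument. */
void starting_per_cpu_sketch(unsigned int cpu)
{
	assert(cpu < NR_CPUS_SKETCH);
	rda_sketch[cpu].started = 1;
}

/* Report whether the given CPU has been marked as started. */
int cpu_started_sketch(unsigned int cpu)
{
	assert(cpu < NR_CPUS_SKETCH);
	return rda_sketch[cpu].started;
}
```

If the code runs on CPU 0 and is asked to start CPU 1, the pre-patch model
marks CPU 0 and leaves CPU 1 untouched, while the post-patch model marks
CPU 1 as intended. When the caller is the incoming CPU itself, the two
models behave the same.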
