linux-kernel - Re: [PATCH RFC] sched: Make wake_up_nohz

Message-Id: <20160702001506.GZ4650@linux.vnet.ibm.com>
Date:	Fri, 1 Jul 2016 17:15:06 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Frederic Weisbecker <fweisbec@...il.com>
Cc:	peterz@...radead.org, tglx@...utronix.de,
	linux-kernel@...r.kernel.org, rgkernel@...il.com
Subject: Re: [PATCH RFC] sched: Make wake_up_nohz_cpu() handle CPUs going
 offline

On Sat, Jul 02, 2016 at 01:49:56AM +0200, Frederic Weisbecker wrote:
> On Fri, Jul 01, 2016 at 11:40:54AM -0700, Paul E. McKenney wrote:
> > On Fri, Jul 01, 2016 at 01:29:59AM +0200, Frederic Weisbecker wrote:
> > > > +/*
> > > > + * Wake up the specified CPU.  If the CPU is going offline, it is the
> > > > + * caller's responsibility to deal with the lost wakeup, for example,
> > > > + * by hooking into the CPU_DEAD notifier like timers and hrtimers do.
> > > > + */
> > > >  void wake_up_nohz_cpu(int cpu)
> > > >  {
> > > > -	if (!wake_up_full_nohz_cpu(cpu))
> > > > +	if (cpu_online(cpu) && !wake_up_full_nohz_cpu(cpu))
> > > 
> > > So at this point, as we passed CPU_DYING, I believe the CPU isn't visible in the domains
> > > anymore (correct me if I'm wrong), therefore get_nohz_timer_target() can't return it,
> > > unless smp_processor_id() is the only alternative.
> > 
> > Right, but the timers have been posted long before even CPU_UP_PREPARE.
> > From what I can see, they are left alone until CPU_DEAD.  Which means
> > that if you try to mod_timer() them between CPU_DYING and CPU_DEAD,
> > you can get the above splat.
> > 
> > Or am I missing something subtle here?
> 
> Yes that's exactly what I meant. It happens on mod_timer() calls
> between CPU_DYING and CPU_DEAD. I just wanted to clarify the
> conditions for it to happen: the fact that it shouldn't concern
> remote CPU targets, only local pinned timers.

OK.  What happens in the following sequence of events?

o	CPU 5 posts a timer, which might well be locally pinned.
	This is rcu_torture_reader() posting its on-stack timer
	creatively named "t".

o	CPU 5 starts going offline, so that rcu_torture_reader() gets
	migrated to CPU 6.

o	CPU 5 reaches CPU_DYING but has not yet reached CPU_DEAD.

o	CPU 6 invokes mod_timer() on its timer "t".

Wouldn't that trigger the scenario that I am seeing?
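
The four steps above can be replayed in a small userspace toy model. This
is only a sketch, not kernel code: every toy_* name, the cpu_states[] and
kicks_sent[] arrays, and replay_scenario() are hypothetical. It also
assumes that the timer stays queued on CPU 5's base after the reader
migrates, and that mod_timer() ends up kicking the CPU owning that base.
The check_online flag stands in for the proposed cpu_online() test.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 8

/* Toy hotplug states; names loosely mirror the notifier stages discussed. */
enum toy_cpu_state { TOY_CPU_ONLINE, TOY_CPU_DYING, TOY_CPU_DEAD };

static enum toy_cpu_state cpu_states[NR_CPUS];
static int kicks_sent[NR_CPUS];	/* wakeup kicks sent to each CPU */

/* Hypothetical stand-in for a timer queued on one CPU's timer base. */
struct toy_timer {
	int base_cpu;
	unsigned long expires;
};

static bool toy_cpu_online(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return cpu_states[cpu] == TOY_CPU_ONLINE;
}

/*
 * Model of the wakeup path.  With check_online set, a CPU that is no
 * longer online is skipped and the wakeup is lost, which the caller is
 * expected to handle later (at CPU_DEAD time in the real kernel).
 * Returns true if a kick was actually sent.
 */
static bool toy_wake_up_nohz_cpu(int cpu, bool check_online)
{
	if (check_online && !toy_cpu_online(cpu))
		return false;
	kicks_sent[cpu]++;
	return true;
}

/* Assumption: re-arming leaves the timer on its base and kicks that CPU. */
static bool toy_mod_timer(struct toy_timer *t, unsigned long expires,
			  bool check_online)
{
	t->expires = expires;
	return toy_wake_up_nohz_cpu(t->base_cpu, check_online);
}

/*
 * Replay the sequence of events.  Returns how many kicks were sent to
 * CPU 5 while it was between CPU_DYING and CPU_DEAD.
 */
static int replay_scenario(bool check_online)
{
	struct toy_timer t;
	int i;

	for (i = 0; i < NR_CPUS; i++) {
		cpu_states[i] = TOY_CPU_ONLINE;
		kicks_sent[i] = 0;
	}

	/* Step 1: CPU 5 posts timer "t". */
	t.base_cpu = 5;
	t.expires = 100;

	/* Step 2: the reader migrates to CPU 6; the timer stays on CPU 5. */

	/* Step 3: CPU 5 reaches CPU_DYING but not yet CPU_DEAD. */
	cpu_states[5] = TOY_CPU_DYING;

	/* Step 4: the reader, now on CPU 6, re-arms "t". */
	toy_mod_timer(&t, 200, check_online);

	return kicks_sent[5];
}
```

In this model, replay_scenario(false) sends a kick to the dying CPU 5,
which corresponds to the case that produces the splat, whereas
replay_scenario(true) sends none and leaves the lost wakeup to be dealt
with later.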

> > > Hence, that call to wake_up_nohz_cpu() can only happen to online CPUs or the current
> > > one (pinned). And wake_up_idle_cpu() on the current CPU is a no-op. So only
> > > wake_up_full_nohz_cpu() is concerned. Then perhaps it would be better to move that
> > > cpu_online() check to wake_up_full_nohz_cpu() ?
> > 
> > As in the patch shown below?  Either way works for me.
> 
> Hmm, the patch doesn't seem to be different than the previous one :-)

Indeed it does not!  How about the one shown below this time?

> > > BTW, it seems that rcutorture stops its kthreads after CPU_DYING, is it expected that
> > > it queues timers at this stage?
> > 
> > Hmmm...  From what I can see, rcutorture cleans up its priority-boost
> > kthreads at CPU_DOWN_PREPARE time.  The other threads are allowed to
> > migrate wherever the scheduler wants, give or take the task shuffling.
> > The task shuffling only excludes one CPU at a time, and I have seen
> > this occur when multiple CPUs were running, e.g., 0, 2, and 3 while
> > offlining 1.
> 
> But if rcutorture kthreads are cleaned up at CPU_DOWN_PREPARE, they
> shouldn't be calling mod_timer() on CPU_DYING time. Or there are other
> rcutorture threads?

The rcu_torture_reader() kthreads aren't associated with any particular
CPU, so when CPUs go offline, they just get migrated to other CPUs.
This allows them to execute on those other CPUs between CPU_DYING and
CPU_DEAD time, correct?

Other rcutorture kthreads -are- bound to specific CPUs, but they are
testing priority boosting, not simple reading.

> > Besides which, doesn't the scheduler prevent anything but the idle
> > thread from running after CPU_DYING time?
> 
> Indeed migrate_tasks() is called on CPU_DYING but pinned kthreads, outside
> smpboot, have their own way to deal with hotplug through notifiers.

Agreed, but the rcu_torture_reader() kthreads aren't pinned, so they
should migrate automatically at CPU_DYING time.

 							Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7f2cae4620c7..1a91fc733a0f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -580,6 +580,8 @@ static bool wake_up_full_nohz_cpu(int cpu)
 	 * If needed we can still optimize that later with an
 	 * empty IRQ.
 	 */
+	if (cpu_is_offline(cpu))
+		return true;
 	if (tick_nohz_full_cpu(cpu)) {
 		if (cpu != smp_processor_id() ||
 		    tick_nohz_tick_stopped())
@@ -590,6 +592,11 @@ static bool wake_up_full_nohz_cpu(int cpu)
 	return false;
 }
 
+/*
+ * Wake up the specified CPU.  If the CPU is going offline, it is the
+ * caller's responsibility to deal with the lost wakeup, for example,
+ * by hooking into the CPU_DEAD notifier like timers and hrtimers do.
+ */
 void wake_up_nohz_cpu(int cpu)
 {
 	if (!wake_up_full_nohz_cpu(cpu))