linux-kernel - Re: [PATCH RFC] sched: Make wake_up_nohz

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <20160712144113.GA8646@linux.vnet.ibm.com>
Date:	Tue, 12 Jul 2016 07:41:13 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Frederic Weisbecker <fweisbec@...il.com>
Cc:	peterz@...radead.org, tglx@...utronix.de,
	linux-kernel@...r.kernel.org, rgkernel@...il.com
Subject: Re: [PATCH RFC] sched: Make wake_up_nohz_cpu() handle CPUs going
 offline

On Mon, Jul 04, 2016 at 09:55:34AM -0700, Paul E. McKenney wrote:
> On Mon, Jul 04, 2016 at 02:21:43PM +0200, Frederic Weisbecker wrote:
> > On Fri, Jul 01, 2016 at 05:15:06PM -0700, Paul E. McKenney wrote:
> > > On Sat, Jul 02, 2016 at 01:49:56AM +0200, Frederic Weisbecker wrote:
> > > > On Fri, Jul 01, 2016 at 11:40:54AM -0700, Paul E. McKenney wrote:

[ . . . ]

> > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > > index 7f2cae4620c7..1a91fc733a0f 100644
> > > --- a/kernel/sched/core.c
> > > +++ b/kernel/sched/core.c
> > > @@ -580,6 +580,8 @@ static bool wake_up_full_nohz_cpu(int cpu)
> > >  	 * If needed we can still optimize that later with an
> > >  	 * empty IRQ.
> > >  	 */
> > > +	if (cpu_is_offline(cpu))
> > > +		return true;
> > 
> > Preferably put this under the tick_nohz_full_cpu() below because
> > it has a static key optimizations. Distros build NO_HZ_FULL but
> > don't use it 99.99999% of the time.
> 
> Except that I need to need to return "true" if the CPU is offline even
> if it is not a tick_nohz_full_cpu(), don't I?
> 
> In fact, the stack trace shows native_smp_send_reschedule(), which leads
> me to believe that the splat happened via the "return false" at the bottom
> of wake_up_full_nohz_cpu().  So I don't see a way to completely hide
> the check under tick_nohz_full_cpu().
> 
> Or am I missing something here?
> 
> > >  	if (tick_nohz_full_cpu(cpu)) {
> > >  		if (cpu != smp_processor_id() ||
> > >  		    tick_nohz_tick_stopped())
> > > @@ -590,6 +592,11 @@ static bool wake_up_full_nohz_cpu(int cpu)
> > >  	return false;
> > >  }
> > >  
> > > +/*
> > > + * Wake up the specified CPU.  If the CPU is going offline, it is the
> > > + * caller's responsibility to deal with the lost wakeup, for example,
> > > + * by hooking into the CPU_DEAD notifier like timers and hrtimers do.
> > > + */
> > 
> > I think it's more transparent than that for the caller. migrate_timers() is
> > called soon after and it takes care of waking up the destination of the migration
> > if necessary. So the caller shouldn't care after all.
> 
> For the current two callers, agreed.  But is this function designed to
> be only used by timers and hrtimers?  If so, shouldn't there be some
> kind of comment to that effect?  If not, shouldn't people considering
> use of this function be warned that it is not unconditional?
> 
> > But the cpu_is_offline() check above may need a comment about that.
> 
> Like this?
> 
> 	if (cpu_is_offline(cpu))
> 		return true;  /* Don't try to wake offline CPUs. */

Hearing no objections, I am proceeding as shown below.

							Thanx, Paul

------------------------------------------------------------------------

commit b55692986a87f0cea92ddfebe7181598f774c2c2
Author: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
Date:   Thu Jun 30 10:37:20 2016 -0700

    sched: Make wake_up_nohz_cpu() handle CPUs going offline
    
    Both timers and hrtimers are maintained on the outgoing CPU until
    CPU_DEAD time, at which point they are migrated to a surviving CPU.  If a
    mod_timer() executes between CPU_DYING and CPU_DEAD time, x86 systems
    will splat in native_smp_send_reschedule() when attempting to wake up
    the just-now-offlined CPU, as shown below from a NO_HZ_FULL kernel:
    
    [ 7976.741556] WARNING: CPU: 0 PID: 661 at /home/paulmck/public_git/linux-rcu/arch/x86/kernel/smp.c:125 native_smp_send_reschedule+0x39/0x40
    [ 7976.741595] Modules linked in:
    [ 7976.741595] CPU: 0 PID: 661 Comm: rcu_torture_rea Not tainted 4.7.0-rc2+ #1
    [ 7976.741595] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    [ 7976.741595]  0000000000000000 ffff88000002fcc8 ffffffff8138ab2e 0000000000000000
    [ 7976.741595]  0000000000000000 ffff88000002fd08 ffffffff8105cabc 0000007d1fd0ee18
    [ 7976.741595]  0000000000000001 ffff88001fd16d40 ffff88001fd0ee00 ffff88001fd0ee00
    [ 7976.741595] Call Trace:
    [ 7976.741595]  [<ffffffff8138ab2e>] dump_stack+0x67/0x99
    [ 7976.741595]  [<ffffffff8105cabc>] __warn+0xcc/0xf0
    [ 7976.741595]  [<ffffffff8105cb98>] warn_slowpath_null+0x18/0x20
    [ 7976.741595]  [<ffffffff8103cba9>] native_smp_send_reschedule+0x39/0x40
    [ 7976.741595]  [<ffffffff81089bc2>] wake_up_nohz_cpu+0x82/0x190
    [ 7976.741595]  [<ffffffff810d275a>] internal_add_timer+0x7a/0x80
    [ 7976.741595]  [<ffffffff810d3ee7>] mod_timer+0x187/0x2b0
    [ 7976.741595]  [<ffffffff810c89dd>] rcu_torture_reader+0x33d/0x380
    [ 7976.741595]  [<ffffffff810c66f0>] ? sched_torture_read_unlock+0x30/0x30
    [ 7976.741595]  [<ffffffff810c86a0>] ? rcu_bh_torture_read_lock+0x80/0x80
    [ 7976.741595]  [<ffffffff8108068f>] kthread+0xdf/0x100
    [ 7976.741595]  [<ffffffff819dd83f>] ret_from_fork+0x1f/0x40
    [ 7976.741595]  [<ffffffff810805b0>] ? kthread_create_on_node+0x200/0x200
    
    However, in this case, the wakeup is redundant, because the timer
    migration will reprogram timer hardware as needed.  Note that the fact
    that preemption is disabled does not avoid the splat, as the offline
    operation has already passed both the synchronize_sched() and the
    stop_machine() that would be blocked by disabled preemption.
    
    This commit therefore modifies wake_up_nohz_cpu() to avoid attempting
    to wake up offline CPUs.  It also adds a comment stating that the
    caller must tolerate lost wakeups when the target CPU is going offline,
    and suggesting the CPU_DEAD notifier as a recovery mechanism.
    
    Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
    Cc: Peter Zijlstra <peterz@...radead.org>
    Cc: Frederic Weisbecker <fweisbec@...il.com>
    Cc: Thomas Gleixner <tglx@...utronix.de>

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7f2cae4620c7..26c01b2e9881 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -580,6 +580,8 @@ static bool wake_up_full_nohz_cpu(int cpu)
 	 * If needed we can still optimize that later with an
 	 * empty IRQ.
 	 */
+	if (cpu_is_offline(cpu))
+		return true;  /* Don't try to wake offline CPUs. */
 	if (tick_nohz_full_cpu(cpu)) {
 		if (cpu != smp_processor_id() ||
 		    tick_nohz_tick_stopped())
@@ -590,6 +592,11 @@ static bool wake_up_full_nohz_cpu(int cpu)
 	return false;
 }
 
+/*
+ * Wake up the specified CPU.  If the CPU is going offline, it is the
+ * caller's responsibility to deal with the lost wakeup, for example,
+ * by hooking into the CPU_DEAD notifier like timers and hrtimers do.
+ */
 void wake_up_nohz_cpu(int cpu)
 {
 	if (!wake_up_full_nohz_cpu(cpu))