Date:	Mon, 16 Nov 2015 17:41:03 -0800
From:	Josh Triplett <josh@...htriplett.org>
To:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc:	Jacob Pan <jacob.jun.pan@...ux.intel.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>,
	John Stultz <john.stultz@...aro.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Arjan van de Ven <arjan@...ux.intel.com>,
	Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
	Len Brown <len.brown@...el.com>,
	Rafael Wysocki <rafael.j.wysocki@...el.com>,
	Eduardo Valentin <edubezval@...il.com>,
	Paul Turner <pjt@...gle.com>
Subject: Re: [PATCH 2/4] timer: relax tick stop in idle entry

On Mon, Nov 16, 2015 at 03:26:40PM -0800, Paul E. McKenney wrote:
> On Mon, Nov 16, 2015 at 02:32:11PM -0800, Josh Triplett wrote:
> > On Mon, Nov 16, 2015 at 01:51:26PM -0800, Jacob Pan wrote:
> > > On Mon, 16 Nov 2015 16:06:57 +0100 (CET)
> > > Thomas Gleixner <tglx@...utronix.de> wrote:
> > > 
> > > > >           <idle>-0     [000]    30.093474: bprint:
> > > > > __tick_nohz_idle_enter: JPAN: tick_nohz_stop_sched_tick 609 delta
> > > > > 1000000 [JP] but sees delta is exactly 1 tick away. didn't stop
> > > > > tick.  
> > > > 
> > > > If the delta is 1 tick then it is not supposed to stop it. Did you
> > > > ever try to figure out WHY it is 1 tick?
> > > > 
> > > > > There are two code paths which can set it to basemono + TICK_NSEC:
> > > > 
> > > >         if (rcu_needs_cpu(basemono, &next_rcu) ||
> > > >             arch_needs_cpu() || irq_work_needs_cpu()) {
> > > >                 next_tick = basemono + TICK_NSEC;
> > > >         } else {
> > > >                 next_tmr = get_next_timer_interrupt(basejiff,
> > > > basemono); ts->next_timer = next_tmr;
> > > >                 /* Take the next rcu event into account */
> > > >                 next_tick = next_rcu < next_tmr ? next_rcu : next_tmr;
> > > >         }
> > > > 
> > > > Can you please figure out WHY the tick is requested to continue
> > > > > instead of blindly wrecking the logic in that code?
> > > 
> > > Looks like it hits both cases during forced idle.
> > > + Josh
> > > + Paul
> > > 
> > > For the first case, it is always related to RCU. I found there are two
> > > CONFIG options to avoid this undesired tick in idle loop.
> > > 1. enable CONFIG_RCU_NOCB_CPU_ALL, offload to rcuo kthreads
> > > 2. or enable CONFIG_RCU_FAST_NO_HZ (enter dyntick-idle w/ rcu callbacks)
> > > 
> > > Either one works but my concern is that users may not realize the
> > > intricate CONFIG_ options and how they translate into energy savings.
> > > After consulting with Josh, it seems we could add a check here to
> > > recognize the forced idle state and relax rcu_needs_cpu() to return
> > > false even if it has callbacks. Since we are blocking everybody for a
> > > short time (5 ticks by default), it should not impact synchronize_rcu()
> > > and kfree_rcu().
> > 
> > Right; as long as you're blocking *everybody*, and RCU priority boosting
> > doesn't come into play (meaning a real-time task is waiting on RCU
> > callbacks), then I don't see any harm in blocking RCU callbacks for a
> > while.  You'd block completion of synchronize_rcu() and similar, as well
> > as memory reclamation, but since you've blocked *every* CPU systemwide
> > then that doesn't cause a problem.
> 
> True enough.  But how does RCU distinguish between this being a
> normal idle cycle that might last indefinitely on the one hand and the
> five-jiffy system-wide throttling on the other?  OK, maybe there is a
> global variable that says that the just-now-starting idle period is
> system-wide throttling.  But then what about the CPU that just went
> idle 10 microseconds ago, and therefore left its timer tick running?
> Fine and well, we could IPI it to wake it up and let it see that we
> are now doing thermal throttling.  But then we presumably also have to
> IPI it at the end of the thermal-throttling interval in order for it to
> re-evaluate whether or not it should have the tick going.  :-/
> 
> On the one hand, I am sure that all of this can be made to work,
> but simply having systems using thermal throttling enable either
> CONFIG_RCU_NOCB_CPU_ALL or CONFIG_RCU_FAST_NO_HZ seems -way- simpler.
> CONFIG_RCU_FAST_NO_HZ is probably the better choice for generic workloads,
> but CONFIG_RCU_NOCB_CPU_ALL is the better choice for embedded workloads
> where it is less likely that RCU callbacks will be posted with continuous
> wild abandon.
> 
> Or am I missing something subtle here?

I agree that it seems preferable to make this require an existing RCU
solution rather than adding more complexity to the RCU idle path.  One
possible thing that may affect the choice of solution: this needs to
idle *every* CPU, without leaving any CPU awake to handle callbacks or
similar.
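
For reference, here is a minimal user-space sketch of the decision Thomas
quoted above, under stated assumptions: struct idle_inputs, pick_next_tick()
and can_stop_tick() are hypothetical names standing in for the kernel's
checks, and TICK_NSEC is set to 1000000 ns to match the delta in Jacob's
trace (that is, HZ=1000 is assumed). It is not the kernel implementation; it
only illustrates why a CPU whose rcu_needs_cpu() check fires ends up with a
next event exactly one tick away and so keeps its tick.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: HZ=1000, which matches "delta 1000000" in the trace. */
#define TICK_NSEC 1000000ULL

/* Hypothetical stand-ins for the results of the kernel-side checks. */
struct idle_inputs {
	bool rcu_needs_cpu;      /* RCU wants the tick kept on this CPU */
	bool arch_needs_cpu;
	bool irq_work_needs_cpu;
	uint64_t next_rcu;       /* absolute time of next RCU event, ns */
	uint64_t next_tmr;       /* absolute time of next timer, ns */
};

/* Mirrors the quoted if/else: either force one tick, or take the
 * earlier of the next RCU event and the next timer. */
static uint64_t pick_next_tick(uint64_t basemono,
			       const struct idle_inputs *in)
{
	if (in->rcu_needs_cpu || in->arch_needs_cpu ||
	    in->irq_work_needs_cpu)
		return basemono + TICK_NSEC;

	return in->next_rcu < in->next_tmr ? in->next_rcu : in->next_tmr;
}

/* Modeled on "if the delta is 1 tick then it is not supposed to stop
 * it": the tick may stop only when the next event is further away
 * than one tick. */
static bool can_stop_tick(uint64_t basemono, const struct idle_inputs *in)
{
	return pick_next_tick(basemono, in) - basemono > TICK_NSEC;
}
```

In this model, setting rcu_needs_cpu makes can_stop_tick() return false no
matter how far away the next timer is, which is the behavior seen in the
trace during forced idle.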

- Josh Triplett