Date:	Thu, 7 Nov 2013 13:59:26 +0100
From:	Frederic Weisbecker <fweisbec@...il.com>
To:	Thomas Gleixner <tglx@...utronix.de>
Cc:	Mike Galbraith <bitbucket@...ine.de>,
	Peter Zijlstra <peterz@...radead.org>,
	LKML <linux-kernel@...r.kernel.org>,
	RT <linux-rt-users@...r.kernel.org>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Subject: Re: CONFIG_NO_HZ_FULL + CONFIG_PREEMPT_RT_FULL = nogo

On Thu, Nov 07, 2013 at 12:21:11PM +0100, Thomas Gleixner wrote:
> Mike,
> 
> On Thu, 7 Nov 2013, Mike Galbraith wrote:
> 
> > On Thu, 2013-11-07 at 04:26 +0100, Mike Galbraith wrote: 
> > > On Wed, 2013-11-06 at 18:49 +0100, Thomas Gleixner wrote: 
> > 
> > > > I bet you are trying to work around some of the side effects of the
> > > > occasional tick which is still necessary despite of full nohz, right?
> > > 
> > > Nope, I wanted to check out cost of nohz_full for rt, and found that it
> > > doesn't work at all instead, looked, and found that the sole running
> > > task has just awakened ksoftirqd when it wants to shut the tick down, so
> > > that shutdown never happens. 
> > 
> > Like so in virgin 3.10-rt.  Box is x3550 M3 booted nowatchdog
> > rcu_nocbs=1-3 nohz_full=1-3, and CPUs1-3 are completely isolated via
> > cpusets as well.
> 
> well, that very same problem is in mainline if you add "threadirqs" to
> the command line. But we can be smart about this. The untested patch
> below should address that issue. If that works on mainline we can
> adapt it for RT (needs a trylock(&base->lock) there).
> 
> Though it's not a full solution. It needs some thought versus the
> softirq code of timers. Assume we have only one timer queued 1000
> ticks into the future. So this change will cause the timer softirq not
> to be called until that timer expires and then the timer softirq is
> going to do 1000 loops until it catches up with jiffies. That's
> anything but pretty ...
> 
> What worries me more is this one:
> 
>   pert-5229  [003] d..h1..   684.482618: softirq_raise: vec=9 [action=RCU]
> 
> The CPU has no callbacks as you shoved them over to cpu 0, so why is
> the RCU softirq raised?

I see, so the problem is that we raise the timer softirq unconditionally
from the tick?
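
To make the question concrete, here is a toy user-space model, not kernel
code. It counts how often the tick would raise the timer softirq when a single
timer is pending, under two policies: raise on every tick, or raise only when
the timer is actually due. The names (toy_count_raises, RAISE_ALWAYS,
RAISE_IF_DUE) are made up for illustration, and the "raise only if due" policy
is an assumption about what a fix might look like, not the patch under
discussion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical policy switch for this toy model only. */
enum raise_policy { RAISE_ALWAYS, RAISE_IF_DUE };

/*
 * Count how many times the tick raises the (modelled) timer softirq over
 * 'ticks' ticks, with exactly one timer queued to expire at tick 'expiry'.
 * Ticks are numbered from 1.
 */
unsigned long toy_count_raises(enum raise_policy policy,
                               unsigned long ticks, unsigned long expiry)
{
    unsigned long raises = 0;
    bool pending = true;

    assert(expiry >= 1);

    for (unsigned long now = 1; now <= ticks; now++) {
        /* The timer is due once its expiry tick has been reached. */
        bool due = pending && now >= expiry;

        if (policy == RAISE_ALWAYS || due) {
            raises++;
            if (due)
                pending = false;  /* the softirq handler ran the timer */
        }
    }
    return raises;
}
```

With one timer queued 1000 ticks ahead, toy_count_raises(RAISE_ALWAYS, 1000,
1000) counts a raise on every tick, while RAISE_IF_DUE counts a single raise
at expiry. On a nohz_full CPU with threaded softirqs, each of those raises is
a ksoftirqd wakeup.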

OK, we definitely don't want to keep that behaviour. Even if softirqs are not
threaded, that's overhead. So I'm looking at that loop in __run_timers()
and I guess you mean the "base->timer_jiffies" increment?

That's indeed not pretty. How do we handle exit from long dynticks idle
periods? Are we doing that loop until we catch up with the new jiffies?

Then it relies on the timer cascade stuff which is very obscure code to me...
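
For reference, the catch-up cost can be sketched with a toy user-space model.
This is not the kernel's __run_timers(): struct toy_base and both functions
are hypothetical, and the cascade logic is left out entirely. The only thing
it is meant to show is that if the base's timer_jiffies lags jiffies by N
ticks, a loop that advances one tick per iteration runs N times before it
catches up.

```c
#include <assert.h>

/* Toy stand-in for a timer base; not the kernel's struct tvec_base. */
struct toy_base {
    unsigned long timer_jiffies;  /* next tick this base has to process */
    unsigned long next_expiry;    /* expiry tick of the single queued timer */
    int           fired;          /* set once the timer has run */
};

/*
 * Advance timer_jiffies one tick at a time until it has passed 'now',
 * firing the timer when its expiry tick is processed.
 * Returns the number of loop iterations.
 */
unsigned long toy_run_timers(struct toy_base *base, unsigned long now)
{
    unsigned long loops = 0;

    /* Signed difference keeps the comparison valid across wraparound. */
    while ((long)(now - base->timer_jiffies) >= 0) {
        if (!base->fired && base->timer_jiffies == base->next_expiry)
            base->fired = 1;
        base->timer_jiffies++;
        loops++;
    }
    return loops;
}

/*
 * One timer queued 'delta' ticks ahead (delta >= 1), with the softirq not
 * run at all until the timer expires. Returns the catch-up loop count.
 */
unsigned long toy_catchup_cost(unsigned long delta)
{
    struct toy_base base = { .timer_jiffies = 1, .next_expiry = delta, .fired = 0 };
    unsigned long loops = toy_run_timers(&base, delta);

    assert(base.fired);  /* the timer must have run by its expiry tick */
    return loops;
}
```

In this model toy_catchup_cost(1000) takes 1000 iterations to fire one timer,
which is the "1000 loops" case described above. The same arithmetic would
apply after a long dynticks idle period if nothing moves timer_jiffies forward
in one step.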