Date:	Thu, 07 Nov 2013 14:07:14 +0100
From:	Mike Galbraith <bitbucket@...ine.de>
To:	Thomas Gleixner <tglx@...utronix.de>
Cc:	Frederic Weisbecker <fweisbec@...il.com>,
	Peter Zijlstra <peterz@...radead.org>,
	LKML <linux-kernel@...r.kernel.org>,
	RT <linux-rt-users@...r.kernel.org>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Subject: Re: CONFIG_NO_HZ_FULL + CONFIG_PREEMPT_RT_FULL = nogo

On Thu, 2013-11-07 at 12:21 +0100, Thomas Gleixner wrote: 
> Mike,
> 
> On Thu, 7 Nov 2013, Mike Galbraith wrote:
> 
> > On Thu, 2013-11-07 at 04:26 +0100, Mike Galbraith wrote: 
> > > On Wed, 2013-11-06 at 18:49 +0100, Thomas Gleixner wrote: 
> > 
> > > > I bet you are trying to work around some of the side effects of the
> > > > occasional tick which is still necessary despite full nohz, right?
> > > 
> > > Nope, I wanted to check out cost of nohz_full for rt, and found that it
> > > doesn't work at all instead, looked, and found that the sole running
> > > task has just awakened ksoftirqd when it wants to shut the tick down, so
> > > that shutdown never happens. 
> > 
> > Like so in virgin 3.10-rt.  Box is x3550 M3 booted nowatchdog
> > rcu_nocbs=1-3 nohz_full=1-3, and CPUs1-3 are completely isolated via
> > cpusets as well.
> 
> well, that very same problem is in mainline if you add "threadirqs" to
> the command line. But we can be smart about this. The untested patch
> below should address that issue. If that works on mainline we can
> adapt it for RT (needs a trylock(&base->lock) there).

Oops, in haste I wedged it straight into 3.10-rt as is.  First pert
attempt was a bit weird, but it eventually worked.

rtbox:/sys/kernel/debug/tracing # !cgexec
cgexec -g cpuset:rtcpus taskset -c 3 pert 5
2400.01 MHZ CPU
perturbation threshold 0.018 usecs.
pert/s:      807 >8.52us:        2 min:  0.04 max: 10.80 avg:  5.56 sum/s:  4485us overhead: 0.45%
pert/s:      707 >8.54us:        4 min:  2.85 max: 11.78 avg:  5.63 sum/s:  3981us overhead: 0.40%
pert/s:      807 >8.51us:        2 min:  0.04 max: 10.86 avg:  5.58 sum/s:  4502us overhead: 0.45%
pert/s:      707 >8.48us:        3 min:  0.04 max: 10.82 avg:  5.59 sum/s:  3959us overhead: 0.40%
pert/s:      630 >8.73us:        5 min:  0.04 max: 16.65 avg:  5.29 sum/s:  3335us overhead: 0.33%
pert/s:      152 >9.50us:        4 min:  0.04 max: 32.58 avg:  0.37 sum/s:    56us overhead: 0.01%
pert/s:       28 >9.74us:        3 min:  0.04 max: 22.31 avg:  1.41 sum/s:    40us overhead: 0.00%
pert/s:        8 >10.02us:        4 min:  1.75 max: 20.56 avg:  4.54 sum/s:    36us overhead: 0.00%
pert/s:        7 >10.23us:        3 min:  1.82 max: 19.94 avg:  4.33 sum/s:    34us overhead: 0.00%
pert/s:        9 >10.45us:        5 min:  0.04 max: 20.79 avg:  4.11 sum/s:    38us overhead: 0.00%
pert/s:       31 >10.57us:        5 min:  0.04 max: 22.13 avg:  1.22 sum/s:    38us overhead: 0.00%
pert/s:       10 >10.77us:        5 min:  0.04 max: 21.40 avg:  3.68 sum/s:    38us overhead: 0.00%
^C
rtbox:/sys/kernel/debug/tracing # cgexec -g cpuset:rtcpus taskset -c 3 pert 5
2400.02 MHZ CPU
perturbation threshold 0.018 usecs.
pert/s:        8 >14.06us:        2 min:  1.70 max: 19.66 avg:  4.24 sum/s:    35us overhead: 0.00%
pert/s:        8 >13.97us:        3 min:  1.80 max: 21.81 avg:  4.48 sum/s:    37us overhead: 0.00%
pert/s:        8 >13.77us:        2 min:  1.77 max: 19.64 avg:  4.35 sum/s:    35us overhead: 0.00%
pert/s:        9 >13.72us:        3 min:  0.04 max: 22.03 avg:  4.35 sum/s:    39us overhead: 0.00%
pert/s:        8 >13.55us:        2 min:  1.75 max: 19.88 avg:  4.16 sum/s:    35us overhead: 0.00%
pert/s:        8 >13.43us:        3 min:  0.04 max: 20.55 avg:  4.21 sum/s:    36us overhead: 0.00%
pert/s:        8 >13.28us:        2 min:  1.74 max: 19.53 avg:  4.34 sum/s:    35us overhead: 0.00%
pert/s:        8 >13.22us:        3 min:  1.76 max: 20.96 avg:  4.35 sum/s:    37us overhead: 0.00%
pert/s:        8 >13.10us:        2 min:  1.72 max: 19.64 avg:  4.38 sum/s:    36us overhead: 0.00%
^C
rtbox:/sys/kernel/debug/tracing # cgexec -g cpuset:rtcpus taskset -c 3 pert 5
2400.03 MHZ CPU
perturbation threshold 0.018 usecs.
pert/s:        9 >14.55us:        2 min:  0.04 max: 20.93 avg:  4.11 sum/s:    37us overhead: 0.00%
pert/s:        8 >14.36us:        3 min:  1.72 max: 20.75 avg:  4.42 sum/s:    36us overhead: 0.00%
pert/s:        8 >14.14us:        2 min:  1.74 max: 20.02 avg:  4.28 sum/s:    35us overhead: 0.00%
pert/s:        8 >13.98us:        3 min:  1.77 max: 20.54 avg:  4.51 sum/s:    36us overhead: 0.00%
pert/s:        8 >13.76us:        2 min:  1.72 max: 19.57 avg:  4.17 sum/s:    35us overhead: 0.00%
pert/s:        8 >13.63us:        3 min:  1.79 max: 20.42 avg:  4.38 sum/s:    36us overhead: 0.00%
pert/s:        9 >13.51us:        2 min:  0.04 max: 20.78 avg:  4.09 sum/s:    37us overhead: 0.00%
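
To compare runs like the ones above without eyeballing columns, the pert lines can be parsed mechanically. The following is only a sketch: the field names (`pert_per_s`, `over_threshold`, and so on) are labels made up here for illustration, not pert's own terminology, and the overhead calculation assumes the `overhead` column is simply `sum/s` expressed as a percentage of one second, which is what the numbers above appear to show.

```python
import re

# Field names are illustrative labels, not pert's own terminology.
PERT_LINE = re.compile(
    r"pert/s:\s*(?P<pert_per_s>\d+)\s+"
    r">(?P<threshold_us>[\d.]+)us:\s*(?P<over_threshold>\d+)\s+"
    r"min:\s*(?P<min_us>[\d.]+)\s+max:\s*(?P<max_us>[\d.]+)\s+"
    r"avg:\s*(?P<avg_us>[\d.]+)\s+sum/s:\s*(?P<sum_us_per_s>\d+)us\s+"
    r"overhead:\s*(?P<overhead_pct>[\d.]+)%"
)


def parse_pert_line(line):
    """Return the numeric fields of one pert output line, or None."""
    m = PERT_LINE.search(line)
    if m is None:
        return None  # header lines, "^C", shell prompts, etc.
    return {name: float(value) for name, value in m.groupdict().items()}


def overhead_from_sum(rec):
    """Microseconds lost per second of wall time, as a percentage.

    Assumption: overhead == sum/s divided by 1,000,000 us.
    """
    return rec["sum_us_per_s"] / 1_000_000 * 100


if __name__ == "__main__":
    sample = ("pert/s:      807 >8.52us:        2 min:  0.04 max: 10.80 "
              "avg:  5.56 sum/s:  4485us overhead: 0.45%")
    rec = parse_pert_line(sample)
    print(rec["pert_per_s"], rec["max_us"], overhead_from_sum(rec))
```

Fed the first line of the first run, this gives 4485us per second, i.e. about 0.45%, matching the printed column; the later lines with sum/s around 35-39us work out to well under 0.01%.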

> What worries me more is this one:
> 
>   pert-5229  [003] d..h1..   684.482618: softirq_raise: vec=9 [action=RCU]
> 
> The CPU has no callbacks as you shoved them over to cpu 0, so why is
> the RCU softirq raised?

Dunno, but it's repeatable.  Workqueues are perturbation sources too:
update_vmstat, drain_caches or such (I didn't save all the traces).
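
For anyone wanting to quantify how often the softirq is raised on an isolated CPU, a small script can tally `softirq_raise` events per CPU and action from saved trace text. This is a sketch only: it assumes lines in the same layout as the one quoted above (task-pid, CPU in brackets, flags, timestamp, event), and the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches ftrace lines shaped like the one quoted in this thread:
#   pert-5229  [003] d..h1..   684.482618: softirq_raise: vec=9 [action=RCU]
RAISE = re.compile(
    r"^\s*(?P<comm>.+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+\S+\s+"
    r"(?P<ts>\d+\.\d+):\s+softirq_raise:\s+vec=(?P<vec>\d+)\s+"
    r"\[action=(?P<action>\w+)\]"
)


def count_softirq_raises(lines):
    """Count softirq_raise events keyed by (cpu, action)."""
    counts = Counter()
    for line in lines:
        m = RAISE.match(line)
        if m:
            counts[(int(m.group("cpu")), m.group("action"))] += 1
    return counts


if __name__ == "__main__":
    trace = [
        "  pert-5229  [003] d..h1..   684.482618: "
        "softirq_raise: vec=9 [action=RCU]",
        "# tracer: nop",  # non-event lines are skipped
    ]
    for (cpu, action), n in sorted(count_softirq_raises(trace).items()):
        print(f"cpu {cpu}: {action} raised {n} time(s)")
```

Run against a saved copy of the trace buffer, the per-CPU counts show at a glance whether RCU (or anything else) is still being raised on CPUs 1-3.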

-Mike
