Open Source and information security mailing list archives
 
Date:	Tue, 17 Feb 2009 17:02:20 -0800
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Frederic Weisbecker <fweisbec@...il.com>,
	Damien Wyart <damien.wyart@...e.fr>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Mike Galbraith <efault@....de>,
	"Rafael J. Wysocki" <rjw@...k.pl>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Kernel Testers List <kernel-testers@...r.kernel.org>
Subject: Re: [Bug #12650] Strange load average and ksoftirqd behavior with
	2.6.29-rc2-git1

On Wed, Feb 18, 2009 at 01:38:01AM +0100, Ingo Molnar wrote:
> 
> * Paul E. McKenney <paulmck@...ux.vnet.ibm.com> wrote:
> 
> > No, it was my confusion -- I later realized that your data 
> > above meant that the force-quiescent-state code path was not 
> > being heavily exercised. So no need for this trace!
> 
> Do you have any theory for why RCU was activated every 100-200 
> microseconds, resulting in 20% ksoftirqd CPU use - and why the 
> problem went away with classic-rcu?

RCU was activated every 100-200 microseconds because, under some
conditions, the x86 32-bit idle loop would call rcu_pending() and
rcu_check_callbacks() in a tight loop.  This was happening with both
classic and tree RCU, but classic RCU has a more exact rcu_pending()
check, so classic RCU's rcu_pending() always returned false.  As a
result, classic RCU's rcu_check_callbacks() was never invoked,
raise_softirq() was never called, control never passed to ksoftirqd,
and tools like "uptime" could not see the activity.

But the activity was occurring with classic RCU nevertheless.

</useful information>

<aside>

Interestingly enough, this is actually a symptom of a theoretical bug
in classic RCU (noted by Manfred Spraul some months ago).  Classic RCU
assumes that interrupts taken from dynticks-idle mode don't run long
enough to span a full grace period (which, in the absence of truly
broken driver code, they do not).  Classic RCU therefore removes all
dynticks-idle CPUs from consideration at the beginning of each grace
period, so that its rcu_pending() doesn't have to concern itself with
dyntick-idle CPUs.

When rcu_pending() is invoked once per jiffy, the additional checking
that tree RCU must do is in the noise, but not when it is called
repeatedly from the idle loop.

</aside>

							Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
