Message-ID: <20101202101658.GA5621@roll>
Date:	Thu, 2 Dec 2010 05:16:58 -0500
From:	tmhikaru@...il.com
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Damien Wyart <damien.wyart@...e.fr>,
	Venkatesh Pallipadi <venki@...gle.com>,
	Chase Douglas <chase.douglas@...onical.com>,
	Ingo Molnar <mingo@...e.hu>,
	Thomas Gleixner <tglx@...utronix.de>,
	linux-kernel@...r.kernel.org, Kyle McMartin <kyle@...artin.ca>,
	tmhikaru@...il.com
Subject: Re: High CPU load when machine is idle (related to PROBLEM: Unusually high load average when idle in 2.6.35, 2.6.35.1 and later)

On Wed, Dec 01, 2010 at 04:27:38PM -0500, tmhikaru@...il.com wrote:
> On Tue, Nov 30, 2010 at 03:59:05PM +0100, Peter Zijlstra wrote:
> > On Tue, 2010-11-30 at 00:01 +0100, Peter Zijlstra wrote:
> > > 
> > > Ok, that's good testing.. so it's still not quite the same as NO_HZ=n,
> > > how about this one?
> > > 
> > > (it seems to drop down to 0.00 if I wait a few minutes with top -d5)
> > 
> > OK, so here's a less crufty patch that gets the same result on my
> > machine, load drops down to 0.00 after a while.
> > 
> > It seems a bit slower to reach 0.00, but that could be because I
> > actually changed the load computation for NO_HZ=n as well, I added a
> > rounding factor in calc_load(), we no longer truncate the division.
> > 
> > If people want to compare, simply remove the third line from
> > calc_load(): load += 1UL << (FSHIFT - 1), to restore the old behaviour.
> 
> For some bizarre reason, this version has a small but noticeable amount of
> jitter and never really seems to hit 0.00 on my machine; it tends to jump
> around at low values between 0.03 and 0.08 on a routine basis:
> 
> 16:20:42 up 16:31,  4 users,  load average: 0.00, 0.01, 0.05
> 
> The jitter has no visible cause: even with no networking, no disk access,
> and no process waking up and demanding attention from the CPU, the load
> goes back up.
> 
> Mind this is obviously NOT as horrible as it was originally, but I'd like to
> find out why it's acting so differently.
> 
> I'm going to try this variant again with the line you mentioned disabled
> and see if it reacts differently. If the rounding factor is the cause, then,
> since you say it was changed for BOTH nohz=y and n, I suspect it's not
> really a problem in the first place and is likely very low load that wasn't
> being accurately reported before.

Indeed, this seems to be the case:

04:50:14 up  5:45,  5 users,  load average: 0.00, 0.00, 0.00

With that single line disabled, the average is no longer jittery (or at
least not noticeably so) and reacts the way I have come to expect. Since you
said this change affects all load calculations, I have not tested how this
patch behaves with the line enabled or disabled under nohz=n; please let me
know if you would like me to test that condition anyway.

Personally, since the rounding changes the previous behavior of the load
calculation, I'd prefer that it not be done.
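
For anyone who wants to compare the two behaviours without rebuilding a
kernel, the following is a minimal userspace sketch of the averaging step with
and without the rounding line. It is not the patch itself. The constants are
the values include/linux/sched.h is believed to use (FSHIFT 11, EXP_1 1884,
EXP_5 2014, EXP_15 2037). The FIXED_1/200 display offset and the helper names
calc_load_sketch(), idle_floor() and shown_hundredths() are assumptions made
for illustration only.

```c
#define FSHIFT   11                /* bits of fixed-point precision */
#define FIXED_1  (1UL << FSHIFT)   /* 1.00 in fixed point */
#define EXP_1    1884UL            /* 1-minute decay factor, fixed point */
#define EXP_5    2014UL            /* 5-minute decay factor */
#define EXP_15   2037UL            /* 15-minute decay factor */

/*
 * One averaging step. 'active' is already scaled by FIXED_1.
 * round_up != 0 adds the rounding term under discussion;
 * round_up == 0 truncates the division as before.
 */
unsigned long calc_load_sketch(unsigned long load, unsigned long exp,
                               unsigned long active, int round_up)
{
    load *= exp;
    load += active * (FIXED_1 - exp);
    if (round_up)
        load += 1UL << (FSHIFT - 1);
    return load >> FSHIFT;
}

/* Start at 1.00 and let the average decay with nothing runnable. */
unsigned long idle_floor(unsigned long exp, int round_up)
{
    unsigned long load = FIXED_1;
    int i;

    for (i = 0; i < 5000; i++)
        load = calc_load_sketch(load, exp, 0, round_up);
    return load;
}

/*
 * Value in hundredths as a /proc/loadavg style formatter would show it,
 * assuming it adds FIXED_1/200 before splitting integer and fraction.
 */
unsigned long shown_hundredths(unsigned long load)
{
    load += FIXED_1 / 200;
    return (load >> FSHIFT) * 100 +
           (((load & (FIXED_1 - 1)) * 100) >> FSHIFT);
}
```

Under these assumptions, an idle system with the rounding term settles at raw
values of 6, 30 and 93 for the 1, 5 and 15 minute averages. These would be
displayed as 0.00, 0.01 and 0.05. Without the term, all three decay to 0. This
is consistent with the averages not reaching 0.00 when the line is enabled. It
does not by itself explain the jumps between 0.03 and 0.08.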

Tim McGrath