linux-kernel - Re: [3.2.16 -> 3.2.17 regression] High reported CPU load when idle

Message-ID: <20120610174939.GA456@burratino>
Date:	Sun, 10 Jun 2012 12:49:39 -0500
From:	Jonathan Nieder <jrnieder@...il.com>
To:	Doug Smythies <dsmythies@...us.net>
Cc:	'Anders Boström' <anders@...insight.net>,
	linux-kernel@...r.kernel.org,
	'Lesław Kopeć' <leslaw.kopec@...za-klasa.pl>,
	'Aman Gupta' <aman@...1.net>,
	'Peter Zijlstra' <a.p.zijlstra@...llo.nl>,
	'Thomas Gleixner' <tglx@...utronix.de>,
	Charles Wang <muming.wq@...il.com>
Subject: Re: [3.2.16 -> 3.2.17 regression] High reported CPU load when idle

Hi Doug et al,

Doug Smythies wrote:

> "does 556061b00c9f ("sched/nohz: Fix rq->cpu_load[] calculations",
> 2012-05-11) change anything?"
>
> I back edited those changes into my test environment yesterday. It
> made no difference with respect to this issue. (minimally tested.)
[...]
> By the way, I found and tested 5aaa0b7a2ed5b12692c9ffb5222182bd558d3146
> It is similar (minimally tested).
>
> I am certainly not an expert, and I find the load average area of the
> code extremely difficult to follow and understand. That being said, I
> think the root issue here is the 10 tick grace period. I think that
> cpu idle enter exit transitions can not be ignored during this period,
> and somehow needs to be accumulated towards the next sample time. So far,
> I have been unsuccessful trying to help with a suggested solution. I will
> continue to try.

Another load average related patch is being discussed (not meant
particularly to address the too-low load case, just mentioning it
FYI):

	sched: Folding nohz load accounting more accurate

	After patch 453494c3d4 (sched: Fix nohz load accounting -- again!), we can fold
	the idle into calc_load_tasks_idle between the last cpu load calculating and
	calc_global_load calling. However problem still exists between the first cpu
	load calculating and the last cpu load calculating. Every time when we do load 
	calculating, calc_load_tasks_idle will be added into calc_load_tasks, even if
	the idle load is caused by calculated cpus. This problem is also described in
	the following link:

	https://lkml.org/lkml/2012/5/24/419

	This bug can be found in our work load. The average running processes number 
	is about 15, but the load only shows about 4.

From [*].

Hope that helps,
Jonathan

[*] http://thread.gmane.org/gmane.linux.kernel/1310462
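
For anyone following along without the source open, here is a rough
sketch of the fixed-point averaging step behind the reported load
figure. It is modeled on the kernel's calc_load(), with FSHIFT = 11 and
EXP_1 = 1884 as I remember them from include/linux/sched.h. The
simulate() helper and the sample counts are hypothetical and only for
illustration. It does not model the nohz folding itself. It only shows
that if the sampled active count comes out as 4 instead of 15, for
whatever reason, the average settles near 4, as in the description
quoted above.

```python
FSHIFT = 11              # bits of fixed-point precision
FIXED_1 = 1 << FSHIFT    # 1.0 in fixed point (2048)
EXP_1 = 1884             # 1-minute decay factor for samples ~5 s apart


def calc_load(load, exp, active):
    """One averaging step; load is fixed point, active is a task count."""
    load = load * exp + active * FIXED_1 * (FIXED_1 - exp)
    load += 1 << (FSHIFT - 1)   # round to nearest instead of truncating
    return load >> FSHIFT


def simulate(active_samples, load=0):
    """Hypothetical helper: feed a sequence of sampled active counts."""
    for active in active_samples:
        load = calc_load(load, EXP_1, active)
    return load


# 200 samples is roughly 1000 seconds of wall time.
correct = simulate([15] * 200)     # every running task is counted
undercount = simulate([4] * 200)   # assumed undercount, not derived here
print(round(correct / FIXED_1, 2), round(undercount / FIXED_1, 2))
```

The averaging step is a plain exponential decay, so it cannot recover
tasks that were never counted: any error in the sampled count carries
straight through to the reported load.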