Message-ID: <20120610174939.GA456@burratino>
Date: Sun, 10 Jun 2012 12:49:39 -0500
From: Jonathan Nieder <jrnieder@...il.com>
To: Doug Smythies <dsmythies@...us.net>
Cc: 'Anders Boström' <anders@...insight.net>,
linux-kernel@...r.kernel.org,
'Lesław Kopeć' <leslaw.kopec@...za-klasa.pl>,
'Aman Gupta' <aman@...1.net>,
'Peter Zijlstra' <a.p.zijlstra@...llo.nl>,
'Thomas Gleixner' <tglx@...utronix.de>,
Charles Wang <muming.wq@...il.com>
Subject: Re: [3.2.16 -> 3.2.17 regression] High reported CPU load when idle
Hi Doug et al,
Doug Smythies wrote:
> "does 556061b00c9f ("sched/nohz: Fix rq->cpu_load[] calculations",
> 2012-05-11) change anything?"
>
> I back edited those changes into my test environment yesterday. It
> made no difference with respect to this issue. (minimally tested.)
[...]
> By the way, I found and tested 5aaa0b7a2ed5b12692c9ffb5222182bd558d3146
> It is similar (minimally tested).
>
> I am certainly not an expert, and I find the load average area of the
> code extremely difficult to follow and understand. That being said, I
> think the root issue here is the 10 tick grace period. I think that
> cpu idle enter/exit transitions cannot be ignored during this period,
> and somehow need to be accumulated towards the next sample time. So far,
> I have been unsuccessful trying to help with a suggested solution. I will
> continue to try.
Another load average related patch is being discussed (not meant
particularly to address the too-low load case, just mentioning it
FYI):
sched: Folding nohz load accounting more accurate
After patch 453494c3d4 (sched: Fix nohz load accounting -- again!), we can fold
the idle into calc_load_tasks_idle between the last cpu load calculation and
the calc_global_load call. However, a problem still exists between the first cpu
load calculation and the last cpu load calculation. Every time we do the load
calculation, calc_load_tasks_idle is added into calc_load_tasks, even if
the idle load is caused by cpus that have already been calculated. This problem
is also described at the following link:
https://lkml.org/lkml/2012/5/24/419
This bug can be seen in our workload. The average number of running processes
is about 15, but the load shows only about 4.
From [*].
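As background for the "15 versus 4" figures quoted above: the reported
load average is an exponentially damped average of the sampled count of
active tasks, so an under-counted sample shows up directly in the
reported number. Below is a rough userspace Python model of that
fixed-point arithmetic. The constants FSHIFT and EXP_1 and the shape of
calc_load() are assumed to match the kernel source of this era; settle()
is a hypothetical helper for illustration only, and none of the per-cpu
folding (calc_load_tasks_idle) is modelled.

```python
# Userspace model of the fixed-point load average update.
# Assumption: constants mirror the kernel's definitions of this era.
FSHIFT = 11              # bits of fractional precision
FIXED_1 = 1 << FSHIFT    # 1.0 in fixed point
EXP_1 = 1884             # 1/exp(5sec/1min) in fixed point


def calc_load(load, exp, active):
    """One sample step: damp the old load and mix in the active count.

    All arguments are fixed-point values scaled by FIXED_1.
    """
    load = load * exp + active * (FIXED_1 - exp)
    load += 1 << (FSHIFT - 1)   # round to nearest instead of truncating
    return load >> FSHIFT


def settle(active_tasks, exp=EXP_1, samples=200):
    """Hypothetical helper: feed a constant sampled task count.

    Returns the resulting load as a float, starting from zero load.
    """
    active = active_tasks * FIXED_1 if active_tasks > 0 else 0
    load = 0
    for _ in range(samples):
        load = calc_load(load, exp, active)
    return load / FIXED_1
```

Under these assumptions settle(15) ends up close to 15 and settle(4)
close to 4, so a reported load of about 4 would be consistent with only
about 4 of the 15 running tasks being counted at each sample.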
Hope that helps,
Jonathan
[*] http://thread.gmane.org/gmane.linux.kernel/1310462