linux-kernel - Re: Inconsistent load average on tickless kernels


Message-ID: <4F551ABE.5080605@nasza-klasa.pl>
Date:	Mon, 05 Mar 2012 20:57:50 +0100
From:	Lesław Kopeć <leslaw.kopec@...za-klasa.pl>
To:	Peter Zijlstra <peterz@...radead.org>
CC:	Aman Gupta <aman@...1.net>, linux-kernel@...r.kernel.org,
	Chase Douglas <chase.douglas@...onical.com>,
	Damien Wyart <damien.wyart@...e.fr>,
	Kyle McMartin <kyle@...hat.com>,
	Venkatesh Pallipadi <venki@...gle.com>,
	Jonathan Nieder <jrnieder@...il.com>
Subject: Re: Inconsistent load average on tickless kernels

On 29.02.2012 13:06, Peter Zijlstra wrote:

> Missing here is a kernel build with CONFIG_NO_HZ but booted with
> nohz=off; this would be an interesting data point because it includes
> all the funny code but still ticks are the right frequency.

You asked for it and you got it. I have rebooted some servers with the
nohz=off parameter set on kernels compiled with CONFIG_NO_HZ=y. They're
the ones listed below with the 'off' suffix.

On 29.02.2012 17:24, Peter Zijlstra wrote:

> Hrmm, this suggests we age too hard with nohz code.. in your test case
> is there significant idle time? That is, suppose you run each cpu at 30%
> what is the period of you load? Running 3s out of 10s is significantly
> different from running .3ms out of 1ms.

It's definitely more similar to the second case: very frequent but
short bursts of activity. A single process does a tiny bit of
computation mixed with a fair amount of network activity on each
request. There are 80 such processes, which are responsible for the
majority of the system load.

On 29.02.2012 18:03, Peter Zijlstra wrote:

>> The only thing I could find is that on nohz we can confuse the per-rq
>> sample period, does the below make a difference? 
> 
> Uhm, something like so that is.. 
> 
> ---
>  kernel/sched/core.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index d7c4322..44f61df 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2380,7 +2380,8 @@ static void calc_load_account_active(struct rq *this_rq)
>  	if (delta)
>  		atomic_long_add(delta, &calc_load_tasks);
>  
> -	this_rq->calc_load_update += LOAD_FREQ;
> +	while (!time_before(jiffies, this_rq->calc_load_update))
> +		this_rq->calc_load_update += LOAD_FREQ;
>  }
>  
>  /*
> 
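
The effect of the one-line change quoted above can be checked outside
the kernel. What follows is a minimal user-space sketch, not the kernel
code: HZ = 250 is an assumed tick rate, time_before() is written as a
plain function in the style of the kernel macro, and advance_once() and
advance_catch_up() are hypothetical helper names used only for
illustration. LOAD_FREQ is the 5*HZ+1 sampling interval.

```c
#define HZ        250                 /* assumed tick rate, illustration only */
#define LOAD_FREQ (5 * HZ + 1)        /* load sampling interval, in jiffies */

/* Wrap-safe "a is before b", in the style of the kernel's time_before(). */
int time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Unpatched behaviour: move the deadline forward by exactly one interval,
 * no matter how far the current time has run past it. */
unsigned long advance_once(unsigned long now, unsigned long deadline)
{
	(void)now;
	return deadline + LOAD_FREQ;
}

/* Patched behaviour: keep adding intervals until the deadline lies
 * strictly in the future relative to 'now'. */
unsigned long advance_catch_up(unsigned long now, unsigned long deadline)
{
	while (!time_before(now, deadline))
		deadline += LOAD_FREQ;
	return deadline;
}
```

If a CPU sleeps through three sampling intervals, advance_once() leaves
the next deadline in the past, so the following sample is taken
immediately rather than one interval later. advance_catch_up() places
the deadline one interval boundary ahead of the current time.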

I have compiled another batch of kernels with this patch applied
(they're the ones with the 'patch0' suffix). The only difference was
that the patch had to go into kernel/sched.c, but that's what you get
when not using the latest sources. Anyway, here are the results,
accompanied by a pretty picture [1]:

                                   std     off     patch0
2.6.32.55-no-hz                    0.76    0.91    -
2.6.32.55-no-hz-74f5187ac8         6.41    9.40    4.93
2.6.32.55-no-hz-0f004f5a69         0.78    0.92    0.90
2.6.37-rc5-no-hz-0f004f5a69        0.95    0.92    0.86
2.6.37-rc5-no-hz-pre-0f004f5a69    9.16    10.47   8.02

It seems that the patch didn't help much on kernels with 0f004f5a69
applied. The ones with just 74f5187ac8 are reporting more plausible
values, but slightly lower than the ones without patch0. Am I right to
assume that the correct load values are the ones produced by kernels
compiled with CONFIG_NO_HZ=n? Should they be the baseline?

I can run additional tests if you have other leads to follow. Is there a
particular kernel version I should focus on? If not, I will continue
to use the current bundle. I'm also planning to give the latest stable
release a spin.

[1] http://img835.imageshack.us/img835/2204/kernelload.png

-- 
Lesław Kopeć
