Message-ID: <001b01d1a1ab$81156e80$83404b80$@net>
Date:	Thu, 28 Apr 2016 17:10:13 -0700
From:	"Doug Smythies" <dsmythies@...us.net>
To:	"'Vik Heyndrickx'" <vik.heyndrickx@...ibox.net>,
	<a.p.zijlstra@...llo.nl>
Cc:	<linux-kernel@...r.kernel.org>,
	"'Damien Wyart'" <damien.wyart@...e.fr>
Subject: RE: [PATCH] sched: loadavg 0.00, 0.01, 0.05 on idle, 1.00, 0.99, 0.95 on full load

On 2016.04.28 11:46 Vik Heyndrickx wrote:

> Hi Peter,
>
> Systems show a minimal load average of 0.00, 0.01, 0.05 even when they 
> have no load at all.
>
> Uptime and /proc/loadavg on all systems with kernels released during the 
> last five years up until kernel version 4.6-rc5, show a 5- and 15-minute 
> minimum loadavg of 0.01 and 0.05 respectively. This should be 0.00 on 
> idle systems, but the way the kernel calculates this value prevents it 
> from getting lower than the mentioned values.
>
> Likewise, though less obviously noticeable, a fully loaded system with 
> no processes waiting shows a maximum 1/5/15 loadavg of 1.00, 0.99, 0.95 
> (multiplied by the number of cores).
>
> Once the (old) load becomes 93 or higher, it mathematically can never
> drop below 93, even when the active load remains 0 forever.
> This results in the strange 0.00, 0.01, 0.05 loadavg values on idle
> systems.  Note: 93/2048 = 0.0454..., which rounds up to 0.05.
>
> It is not correct to add a 0.5 rounding term (= 1024/2048) here, since
> the result of this function is fed back into the next iteration. The
> repeated +1024 accumulates until it balances the decay: at steady state
> load * 2048 = load * 2037 + 1024, i.e. a virtual "ghost" load of
> 1024 / (2048 - 2037) ~= 93 exists next to the old and active load terms.
>
> By changing the way the internally kept value is rounded, that internal 
> value can now reach 0.00 on idle and 1.00 on full load: while the load 
> is increasing, the internally kept load value is rounded up; while it is 
> decreasing, it is rounded down.
>
> The modified code was tested on nohz=off and nohz kernels. It was tested 
> on vanilla kernel 4.6-rc5 and on the CentOS 7.1 kernel 3.10.0-327. It was 
> tested on single-, dual-, and octa-core systems, on virtual hosts and on 
> bare hardware. No unwanted effects were observed, and the problems the 
> patch intended to fix were indeed gone.
>
> Fixes: 0f004f5a696a ("sched: Cure more NO_HZ load average woes")
> Cc: Doug Smythies <dsmythies@...us.net>
> Tested-by: Damien Wyart <damien.wyart@...e.fr>
> Signed-off-by: Vik Heyndrickx <vik.heyndrickx@...ibox.net>
>
> --- kernel/sched/loadavg.c.orig	2016-04-25 01:17:05.000000000 +0200
> +++ kernel/sched/loadavg.c	2016-04-28 16:47:47.754266136 +0200
> @@ -99,10 +99,12 @@ long calc_load_fold_active(struct rq *th
>  static unsigned long
>  calc_load(unsigned long load, unsigned long exp, unsigned long active)
>  {
> -	load *= exp;
> -	load += active * (FIXED_1 - exp);
> -	load += 1UL << (FSHIFT - 1);
> -	return load >> FSHIFT;
> +	unsigned long newload;
> +
> +	newload = load * exp + active * (FIXED_1 - exp);
> +	if (active >= load)
> +		newload += FIXED_1 - 1;
> +	return newload / FIXED_1;
>  }
>
>  #ifdef CONFIG_NO_HZ_COMMON

See also: https://bugzilla.kernel.org/show_bug.cgi?id=45001
I also tested this patch, on 2016.01.22, and it works fine.

