Message-ID: <5549E3B6.2060709@arm.com>
Date:	Wed, 06 May 2015 10:49:42 +0100
From:	Dietmar Eggemann <dietmar.eggemann@....com>
To:	"pang.xunlei@....com.cn" <pang.xunlei@....com.cn>
CC:	Juri Lelli <Juri.Lelli@....com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-kernel-owner@...r.kernel.org" 
	<linux-kernel-owner@...r.kernel.org>,
	"mingo@...hat.com" <mingo@...hat.com>,
	Morten Rasmussen <Morten.Rasmussen@....com>,
	"mturquette@...aro.org" <mturquette@...aro.org>,
	"nico@...aro.org" <nico@...aro.org>,
	Peter Zijlstra <peterz@...radead.org>,
	"preeti@...ux.vnet.ibm.com" <preeti@...ux.vnet.ibm.com>,
	"rjw@...ysocki.net" <rjw@...ysocki.net>,
	"vincent.guittot@...aro.org" <vincent.guittot@...aro.org>,
	"yuyang.du@...el.com" <yuyang.du@...el.com>
Subject: Re: [RFCv3 PATCH 12/48] sched: Make usage tracking cpu scale-invariant

On 03/05/15 07:27, pang.xunlei@....com.cn wrote:
> Hi Dietmar,
> 
> Dietmar Eggemann <dietmar.eggemann@....com>  wrote 2015-03-24 AM 03:19:41:
>>
>> Re: [RFCv3 PATCH 12/48] sched: Make usage tracking cpu scale-invariant

[...]

>> In the previous patch-set https://lkml.org/lkml/2014/12/2/332 we
>> cpu-scaled both (sched_avg::runnable_avg_sum (load) and
>> sched_avg::running_avg_sum (utilization)) but during the review Vincent
>> pointed out that a cpu-scaled invariant load signal messes up
>> load-balancing based on s[dg]_lb_stats::avg_load in overload scenarios.
>>
>> avg_load = load/capacity and load can't be simply replaced here by
>> 'cpu-scale invariant load' (which is load*capacity).
> 
> I can't see why it shouldn't.
> 
> For "avg_load = load/capacity", "avg_load" stands for how busy the cpu
> is; it is actually a value relative to its capacity. The system is seen
> as balanced when one task runs on a 512-capacity cpu contributing 50%
> usage and two of the same tasks run on the 1024-capacity cpu contributing
> 50% usage. "capacity" in this formula contains uarch capacity; "load" in
> this formula must be an absolute real load, not relative.
> 
> But with the current kernel implementation, "load" computed without this
> patch is a relative value. For example, when one task (1024 weight) runs
> on a 1024-capacity CPU, it gets a 256 load contribution (25% on this CPU).
> When it runs on a 512-capacity CPU, it gets a 512 load contribution (50%
> on this CPU).
> See, currently runnable "load" is relative, so "avg_load" is actually wrong
> and its value equals that of "load". So I think the runnable load should be
> made cpu scale-invariant as well.
> 
> Please correct me if I am wrong.

Cpu-scaled load leads to wrong lb decisions in overload scenarios:

(1) Overload example taken from email thread between Vincent and Morten:
    https://lkml.org/lkml/2014/12/30/114

7 always-running tasks, 4 on cluster 0, 3 on cluster 1:

              cluster 0       cluster 1
capacity      1024 (2*512)    1024 (1*1024)
load          4096            3072
scale_load    2048            3072

Simply using cpu-scaled load in the existing lb code would declare
cluster 1 busier than cluster 0, although the compute capacity budget
for one task is higher on cluster 1 (1024/3 = 341) than on cluster 0
(2*512/4 = 256).
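
The arithmetic behind example (1) can be reproduced with a minimal sketch.
This is plain Python, not kernel code. The names SCALE, cpu_scaled() and
avg_load() are illustrative assumptions; avg_load follows the load/capacity
formula quoted above, kept in units of 1024.

```python
SCALE = 1024  # assumed fixed-point unit: capacity of the biggest cpu


def cpu_scaled(load, cpu_capacity):
    """Cpu-scale invariant load: load * capacity, in SCALE units."""
    return load * cpu_capacity // SCALE


def avg_load(load, group_capacity):
    """avg_load = load / capacity, kept in SCALE units."""
    return load * SCALE // group_capacity


TASK_LOAD = 1024  # load of one always-running task

# (cpu_capacity, nr_cpus, nr_tasks) per cluster, taken from example (1)
clusters = {"cluster 0": (512, 2, 4), "cluster 1": (1024, 1, 3)}

results = {}
for name, (cpu_capacity, nr_cpus, nr_tasks) in clusters.items():
    capacity = cpu_capacity * nr_cpus
    load = TASK_LOAD * nr_tasks
    scale_load = cpu_scaled(load, cpu_capacity)
    results[name] = {
        "load": load,
        "scale_load": scale_load,
        "avg_load": avg_load(load, capacity),
        "avg_scale_load": avg_load(scale_load, capacity),
        "budget_per_task": capacity // nr_tasks,
    }

# The busiest cluster is the one with the highest avg_load.
busiest_by_load = max(results, key=lambda n: results[n]["avg_load"])
busiest_by_scale_load = max(results, key=lambda n: results[n]["avg_scale_load"])

for name, r in results.items():
    print(name, r)
print("busiest by load:", busiest_by_load)
print("busiest by scale_load:", busiest_by_scale_load)
```

With load, cluster 0 comes out busiest; with cpu-scaled load, the ranking
flips to cluster 1, even though cluster 1 has the larger per-task budget.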

(2) A non-overload example does not show this problem:

7 tasks, each with 12.5% utilization (scaled to 1024), 4 on cluster 0,
3 on cluster 1:

              cluster 0       cluster 1
capacity      1024 (2*512)    1024 (1*1024)
load          1024            384
scale_load    512             384

Here cluster 0 is busier whether we take load or cpu-scaled load.
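
The numbers in example (2) can be derived from the per-task demand. The
sketch below is plain Python, not kernel code, and assumes that a task
needing 12.5% of a 1024-capacity cpu is runnable for proportionally longer
on a smaller cpu. DEMAND, task_load() and task_scale_load() are
hypothetical names used only for this illustration.

```python
SCALE = 1024          # assumed fixed-point unit: capacity of the biggest cpu
DEMAND = SCALE // 8   # 12.5% of a 1024-capacity cpu = 128


def task_load(cpu_capacity):
    """Non-invariant load: runnable fraction on this cpu, in SCALE units."""
    return DEMAND * SCALE // cpu_capacity


def task_scale_load(cpu_capacity):
    """Cpu-scaled load: load * capacity, in SCALE units."""
    return task_load(cpu_capacity) * cpu_capacity // SCALE


# (cpu_capacity, nr_tasks) per cluster, taken from example (2)
clusters = {"cluster 0": (512, 4), "cluster 1": (1024, 3)}

load = {n: task_load(cap) * nr for n, (cap, nr) in clusters.items()}
scale_load = {n: task_scale_load(cap) * nr for n, (cap, nr) in clusters.items()}

# Both clusters have a total capacity of 1024, so comparing the sums
# gives the same ordering as comparing avg_load.
busiest_by_load = max(load, key=load.get)
busiest_by_scale_load = max(scale_load, key=scale_load.get)

print("load:", load)
print("scale_load:", scale_load)
print("busiest:", busiest_by_load, busiest_by_scale_load)
```

Both signals agree on cluster 0 here because neither cluster is saturated.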

We should continue to use avg_load based on load (maybe calculated out
of scaled load once introduced?) for overload scenarios and use
scale_load for non-overload scenarios. Since this hasn't been
implemented yet, we got rid of cpu-scaled load in this RFC.
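
As a rough illustration of that split, and not something this RFC
implements, the selection could look like the Python sketch below. The
overload test (cpu-scaled load exceeding group capacity) is purely an
assumption made for the example, as are the names is_overloaded(),
busyness() and busiest().

```python
SCALE = 1024  # assumed fixed-point unit: capacity of the biggest cpu


def is_overloaded(group):
    # Assumption for this sketch: a group is overloaded when its
    # cpu-scaled load exceeds its compute capacity.
    return group["scale_load"] > group["capacity"]


def busyness(group, overloaded):
    # Overload: avg_load based on load. Otherwise: based on scale_load.
    metric = group["load"] if overloaded else group["scale_load"]
    return metric * SCALE // group["capacity"]


def busiest(groups):
    overloaded = any(is_overloaded(g) for g in groups.values())
    return max(groups, key=lambda name: busyness(groups[name], overloaded))


# Example (1): 7 always-running tasks
overload = {
    "cluster 0": {"capacity": 1024, "load": 4096, "scale_load": 2048},
    "cluster 1": {"capacity": 1024, "load": 3072, "scale_load": 3072},
}
# Example (2): 7 tasks of 12.5% utilization
non_overload = {
    "cluster 0": {"capacity": 1024, "load": 1024, "scale_load": 512},
    "cluster 1": {"capacity": 1024, "load": 384, "scale_load": 384},
}

print("overload:", busiest(overload))
print("non-overload:", busiest(non_overload))
```

Under these assumptions cluster 0 is picked as busiest in both examples,
which matches the per-task capacity budget argument in example (1).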

[...]
