linux-kernel - Re: [PATCH 1/7] sched: Introduce scale-invariant load tracking


Message-ID: <CAKfTPtAefmLDsEhWgri=urSmHQ-sydqcYDGX08dN982Lav1sGw@mail.gmail.com>
Date:	Wed, 8 Oct 2014 16:08:04 +0200
From:	Vincent Guittot <vincent.guittot@...aro.org>
To:	Morten Rasmussen <morten.rasmussen@....com>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	"mingo@...hat.com" <mingo@...hat.com>,
	Dietmar Eggemann <Dietmar.Eggemann@....com>,
	Paul Turner <pjt@...gle.com>,
	Benjamin Segall <bsegall@...gle.com>,
	Nicolas Pitre <nicolas.pitre@...aro.org>,
	Mike Turquette <mturquette@...aro.org>,
	"rjw@...ysocki.net" <rjw@...ysocki.net>,
	linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/7] sched: Introduce scale-invariant load tracking

On 8 October 2014 15:53, Morten Rasmussen <morten.rasmussen@....com> wrote:
> On Wed, Oct 08, 2014 at 12:21:45PM +0100, Vincent Guittot wrote:
>> On 8 October 2014 13:00, Morten Rasmussen <morten.rasmussen@....com> wrote:

>> >
>> > Sure. The easiest way to avoid introducing overflows is to ensure that
>> > we always scale by a factor <= 1.0. That should be true as long as
>> > arch_scale_{cpu,freq}_capacity() never returns anything greater than
>> > SCHED_CAPACITY_SCALE (= 1024 = 1.0).
>>
>> The current ARM arch_scale_cpu_capacity() is in the range [0..1536],
>> which is free of overflow AFAICT.
>
> If I'm not mistaken, that will cause an overflow in
> __update_task_entity_contrib():
>
> static inline void __update_task_entity_contrib(struct sched_entity *se)
> {
>         u32 contrib;
>         /* avoid overflowing a 32-bit type w/ SCHED_LOAD_SCALE */
>         contrib = se->avg.runnable_avg_sum * scale_load_down(se->load.weight);
>         contrib /= (se->avg.avg_period + 1);
>         se->avg.load_avg_contrib = scale_load(contrib);
> }
>
> With arch_scale_cpu_capacity() > 1024, se->avg.runnable_avg_sum is no
> longer bounded by LOAD_AVG_MAX = 47742. scale_load_down(se->load.weight)
> == se->load.weight <= 88761.
>
>         47742 * 88761 = 4237627662 (2^32 = 4294967296)
>
> To avoid overflow se->avg.runnable_avg_sum must be less than 2^32/88761
> = 48388, which means that arch_scale_cpu_capacity() must be in the range
> 0..48388*1024/47742 = 0..1037.
>
> I also think it is easier to have a fixed defined max scaling factor,
> but that might just be me.

OK, so the overflow comes from adding uarch invariance to the runnable
load average.
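
For reference, the bound quoted above can be checked with a small
standalone C sketch. It assumes, as the discussion does, that the
worst-case runnable_avg_sum grows linearly with the capacity factor
(LOAD_AVG_MAX * capacity / 1024). The helper names are hypothetical and
are not kernel code; only the constants come from this thread.

```c
#include <stdint.h>

/* Constants as quoted in the thread. */
#define LOAD_AVG_MAX         47742u /* bound on runnable_avg_sum, unscaled */
#define MAX_TASK_WEIGHT      88761u /* largest se->load.weight mentioned */
#define SCHED_CAPACITY_SCALE 1024u  /* fixed-point 1.0 */

/*
 * Hypothetical helper: worst-case runnable_avg_sum, ASSUMING it is
 * scaled linearly by capacity / SCHED_CAPACITY_SCALE.
 */
static inline uint64_t scaled_sum_max(uint32_t capacity)
{
        return (uint64_t)LOAD_AVG_MAX * capacity / SCHED_CAPACITY_SCALE;
}

/*
 * Hypothetical helper: nonzero if the worst-case product
 * runnable_avg_sum * weight no longer fits in a u32.
 */
static inline int contrib_overflows(uint32_t capacity)
{
        return scaled_sum_max(capacity) * MAX_TASK_WEIGHT > UINT32_MAX;
}

/*
 * Hypothetical helper: largest capacity value for which the
 * worst-case product still fits in a u32.
 */
static inline uint32_t max_safe_capacity(void)
{
        uint32_t max_sum = UINT32_MAX / MAX_TASK_WEIGHT;

        return (uint32_t)((uint64_t)max_sum * SCHED_CAPACITY_SCALE /
                          LOAD_AVG_MAX);
}
```

With these constants, max_safe_capacity() evaluates to 1037, matching the
range given above, and contrib_overflows(1536) is true while
contrib_overflows(1024) is false.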

>
> Regarding the ARM arch_scale_cpu_capacity() implementation, I think that
> can be changed to fit the 0..1024 range easily. Currently, it will only
> report a non-default (1024) capacity for big.LITTLE systems and actually
> enabling it (requires a certain property to be set in device tree) leads
> to broken load-balancing decisions. We have discussed that several times

Only the one-task-per-CPU case is broken. On the other hand, it better
handles the overload use case, where we have more tasks than CPUs, and
other mid-range use cases, by putting more tasks on the big cluster.

> in the past. I wouldn't recommend enabling it until the load-balance
> code can deal with big.LITTLE compute capacities correctly. This is also
> why it isn't implemented by ARM64.