Date:   Wed, 5 Jul 2017 13:22:05 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Jeffrey Hugo <jhugo@...eaurora.org>
Cc:     Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Austin Christ <austinwc@...eaurora.org>,
        Tyler Baicar <tbaicar@...eaurora.org>,
        Timur Tabi <timur@...eaurora.org>,
        Morten Rasmussen <morten.rasmussen@....com>
Subject: Re: [PATCH V5 2/2] sched/fair: Remove group imbalance from
 calculate_imbalance()

On Wed, Jun 07, 2017 at 01:18:58PM -0600, Jeffrey Hugo wrote:
> The group_imbalance path in calculate_imbalance() made sense when it was
> added back in 2007 with commit 908a7c1b9b80 ("sched: fix improper load
> balance across sched domain") because busiest->load_per_task factored into
> the amount of imbalance that was calculated. That is not the case today.

It would be nice to have some more information on which patch(es)
changed that.

> The group_imbalance path can only affect the outcome of
> calculate_imbalance() when the average load of the domain is less than the
> original busiest->load_per_task. In this case, busiest->load_per_task is
> overwritten with the scheduling domain load average. Thus
> busiest->load_per_task no longer represents actual load that can be moved.
> 
> At the final comparison between env->imbalance and busiest->load_per_task,
> imbalance may be larger than the new busiest->load_per_task, causing the
> check to fail on the assumption that there is a task that could be
> migrated to satisfy the imbalance. However, env->imbalance may still be
> smaller than the original busiest->load_per_task, so it is unlikely that
> any task can be migrated to satisfy the imbalance. As a result,
> calculate_imbalance() does not run fix_small_imbalance() when we expect
> it should. In the worst case, this can result in idle cpus.
> 
> Since the group imbalance path in calculate_imbalance() is at best a NOP
> but otherwise harmful, remove it.
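
For readers following along, the path in question looks roughly like the
sketch below; this is a paraphrase of calculate_imbalance() in
kernel/sched/fair.c of that era, so exact code and comments may differ:

	/* The group_imbalance path in calculate_imbalance(): when the
	 * busiest group is flagged imbalanced, load_per_task is clamped
	 * to the domain-wide average load.
	 */
	if (busiest->group_type == group_imbalanced) {
		/*
		 * In the group_imbalanced case we cannot rely on group-wide
		 * averages, so fall back to the wider domain average.
		 */
		busiest->load_per_task =
			min(busiest->load_per_task, sds->avg_load);
	}

	/* ... */

	/* The final check that gates fix_small_imbalance(). With the
	 * clamped load_per_task, env->imbalance can exceed it and skip
	 * fix_small_imbalance() even though the imbalance is still smaller
	 * than the load of any single task that could actually be moved.
	 */
	if (env->imbalance < busiest->load_per_task)
		return fix_small_imbalance(env, sds);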

load_per_task is horrible and should die. Ever since we did cgroup
support, the number has been complete crap, but even before that the
concept was dubious.

Most of the logic that uses the number stems from the pre-smp-nice era.

This also of course means that fix_small_imbalance() is probably a load
of crap. Digging through all that has been on the todo list for a long
while but somehow not something I've ever gotten to :/
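
For context on the number being objected to here: load_per_task is
computed as a plain average in update_sg_lb_stats(), roughly as in the
paraphrased sketch below (field names follow the fair.c of that era).
With cgroup scheduling, the per-entity weights feeding sum_weighted_load
can differ by orders of magnitude, so this "average task" often
corresponds to no task that could actually be migrated.

	/* Roughly how load_per_task is derived in update_sg_lb_stats():
	 * the group's weighted load averaged over its runnable tasks.
	 */
	if (sgs->sum_nr_running)
		sgs->load_per_task = sgs->sum_weighted_load /
				     sgs->sum_nr_running;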
