linux-kernel - Re: [PATCHv4 00/12] sched/fair: Migrate 'misfit' tasks on asymmetric capacity systems


Message-ID: <7bad2238-369c-f61c-0a51-eb14204e0429@arm.com>
Date:   Mon, 30 Jul 2018 16:30:27 +0200
From:   Dietmar Eggemann <dietmar.eggemann@....com>
To:     Valentin Schneider <valentin.schneider@....com>,
        Morten Rasmussen <morten.rasmussen@....com>,
        Vincent Guittot <vincent.guittot@...aro.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>, gaku.inami.xh@...esas.com,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCHv4 00/12] sched/fair: Migrate 'misfit' tasks on asymmetric
 capacity systems

On 07/26/2018 07:14 PM, Valentin Schneider wrote:
> Hi,
> 
> On 09/07/18 16:08, Morten Rasmussen wrote:
>> On Fri, Jul 06, 2018 at 12:18:27PM +0200, Vincent Guittot wrote:
>>> Hi Morten,
>>>
>>> On Wed, 4 Jul 2018 at 12:18, Morten Rasmussen <morten.rasmussen@....com> wrote:

[...]

> With that out of the way, I did some lmbench runs:
>> lat_mem_rd 10 1024
> 
> With ASYM_PACKING, I still see lmbench tasks remaining on LITTLE CPUs while
> bigs are free, because ASYM_PACKING only does explicit active balancing on
> CPU_NEWLY_IDLE balancing - otherwise it'll rely on the nr_balance_failed counter.
> 
> However, that counter can be reset before it reaches the threshold at which
> active balance is done, which can lead to huge upmigration delays (almost a
> full second). I also see the same kind of issues on Juno r0.
> 
> This could be resolved by extending ASYM_PACKING active balancing to
> non NEWLY_IDLE cases, but then we'd be thrashing everything. That's another
> argument for basing upmigration on task load-tracking signals, as we can
> determine which tasks need active balancing much faster than the
> nr_balance_failed counter way while not active balancing the world.

The task layout of the test looks like n=85 always-running tasks (each 
running for ~1.25ms on big or little), created and run one after the 
other. So on a big cpu their util values go from 512 to 1024, and on a 
little cpu (Juno board) from 223 to 446. The latter is thanks to 
Quentin's 'sched/fair: Fix util_avg of new tasks for asymmetric systems'.

root@...o:~# cat /sys/devices/system/cpu/cpu[01]/cpu_capacity
446
1024
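
The starting points 512 and 223 quoted above are simply half of the two 
cpu_capacity values. A minimal sketch of that arithmetic, assuming the 
new task lands on an otherwise idle cfs_rq and gets half of the CPU's 
spare capacity as its initial util_avg (the helper name 
initial_util_avg is hypothetical, and this ignores the other paths the 
kernel code takes when the runqueue is already loaded):

```python
def initial_util_avg(cpu_capacity, cfs_rq_util_avg=0):
    """Hypothetical helper: half of the spare capacity of the CPU.

    Assumption: mirrors only the idle-runqueue case, where the new
    task's util_avg starts at (capacity - current util) / 2.
    """
    spare = max(cpu_capacity - cfs_rq_util_avg, 0)
    return spare // 2


# cpu_capacity values read from sysfs on the Juno board above.
for name, capacity in (("little", 446), ("big", 1024)):
    start = initial_util_avg(capacity)
    print(f"{name}: util starts at {start}, saturates at {capacity}")
```

An always-running task then ramps from that starting value towards the 
capacity of the CPU it runs on, which gives the two ranges mentioned.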

> (lat_mem_rd 10 1024) with ASYM_PACKING:

...
> 4.0 148.66   <-----
> 4.5 10.191
...
> 7.5 10.203
> 8.0 154.354   <-----

I ran the test affine to big, little and all cpus on tip/sched/core w/o 
ASYM_PACKING or Misfit:

cputype:     big  little     all
cpumask:    0x06    0x39    0xff

mem size   <---- latency  ---->
  [MB]              [ns]

  0.00098   3.668   3.595   3.669
  0.00195   3.668   3.594   3.594
  0.00293   3.668   3.593   3.595
  0.00391   3.669   3.596   3.595
  ...
  3.75000  58.687 121.934 122.293
  4.00000  57.054 121.771 120.489
  4.50000  56.914 121.851  56.729
  5.00000  57.347 121.777  56.975
  5.50000  57.705 121.738  68.981
  6.00000  57.935 121.728  57.542
  6.50000  58.119 121.694 121.799
  7.00000  58.194 121.502  57.844
  7.50000  58.258 121.684  58.050
  8.00000  58.293 121.725  58.030
  9.00000  58.309 121.793  58.188
 10.00000  58.561 122.252 122.078
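
For reference, the cpumask row in the table header can be decoded into 
CPU numbers with a few lines. This is only a sketch; the helper name 
cpus_in_mask is made up for illustration. Note that 0xff also sets bits 
for CPUs 6 and 7, which do not exist on a 6-CPU Juno, so for 'all' only 
bits 0-5 are meaningful there.

```python
def cpus_in_mask(mask):
    """Return the CPU numbers whose bits are set in an affinity mask."""
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]


# Masks taken from the table header above.
for cputype, mask in (("big", 0x06), ("little", 0x39), ("all", 0xFF)):
    print(f"{cputype:6s} {mask:#04x} -> CPUs {cpus_in_mask(mask)}")
```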

There is no difference between big and little cpus for small memory 
sizes, only in the MB range.
If I look into the trace for 'all', it turns out that there are cases in 
which, even if the task only ran for ~15% of the time on big, the 
latency value is printed as if it had been running affine to big. So 
using the latency value as an indicator of where the task was scheduled 
is IMHO not really possible.
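
To make the pattern in the 'all' column easier to see, here is a rough 
sketch that labels each MB-range row by whichever affine column (big or 
little) its 'all' latency is numerically closer to. The numbers are 
copied from the table above; the helper closer_column is hypothetical. 
The label only says which reference column the value resembles, not 
where the task actually ran, which is exactly the caveat above.

```python
# (size_mb, big_ns, little_ns, all_ns), copied from the table above.
ROWS = [
    (3.75, 58.687, 121.934, 122.293),
    (4.00, 57.054, 121.771, 120.489),
    (4.50, 56.914, 121.851, 56.729),
    (5.00, 57.347, 121.777, 56.975),
    (5.50, 57.705, 121.738, 68.981),
    (6.00, 57.935, 121.728, 57.542),
    (6.50, 58.119, 121.694, 121.799),
    (7.00, 58.194, 121.502, 57.844),
    (7.50, 58.258, 121.684, 58.050),
    (8.00, 58.293, 121.725, 58.030),
    (9.00, 58.309, 121.793, 58.188),
    (10.00, 58.561, 122.252, 122.078),
]


def closer_column(big, little, value):
    """Hypothetical helper: which affine result is 'value' nearer to?"""
    return "big" if abs(value - big) <= abs(value - little) else "little"


labels = [(size, closer_column(big, little, all_ns))
          for size, big, little, all_ns in ROWS]

for size, label in labels:
    print(f"{size:5.2f} MB: 'all' latency resembles {label}")
```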