Message-ID: <20181108113532.vh7q3s3k7vpbevl3@queper01-lin>
Date:   Thu, 8 Nov 2018 11:35:34 +0000
From:   Quentin Perret <quentin.perret@....com>
To:     Dietmar Eggemann <dietmar.eggemann@....com>
Cc:     Vincent Guittot <vincent.guittot@...aro.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Morten Rasmussen <Morten.Rasmussen@....com>,
        Patrick Bellasi <patrick.bellasi@....com>,
        Paul Turner <pjt@...gle.com>, Ben Segall <bsegall@...gle.com>,
        Thara Gopinath <thara.gopinath@...aro.org>,
        pkondeti@...eaurora.org
Subject: Re: [PATCH v5 2/2] sched/fair: update scale invariance of PELT

On Wednesday 07 Nov 2018 at 11:47:09 (+0100), Dietmar Eggemann wrote:
> The important bit for EAS is that it only uses utilization in the
> non-overutilized case. Here, utilization signals should look the same
> between the two approaches, not considering tasks with long periods like the
> 39/80ms example above.
> There are also some advantages for EAS with time scaling: (1) faster
> overutilization detection when a big task runs on a little CPU, (2) higher
> (initial) task utilization value when this task migrates from little to big
> CPU.

Agreed, these patches should help detect over-utilized scenarios
faster and more reliably, which is probably a good thing. I'll try to
have a look in more detail soon.

> We should run our EAS task placement tests with your time scaling patches.

Right, I tried these patches with the synthetic tests we usually run
against our upstream EAS dev branch (see [1]), and I couldn't see any
regression, which is a good sign :-)


<slightly off topic>
Since most people are probably not familiar with these tests, I'll try
to elaborate a little bit more. They are unit tests aimed at stressing
particular behaviours of the scheduler on asymmetric platforms. More
precisely, they check that capacity-awareness/misfit handling and EAS
are actually able to up-migrate and down-migrate tasks between big and
little CPUs when necessary.

The tests are based on rt-app and ftrace. They basically run a whole
lot of scenarios with rt-app (small tasks, big tasks, a mix of both,
tasks changing behaviour, ramping up, ramping down, ...), pull a trace
of the execution and check that (see the sketch below the list):

   1. the task(s) did not miss activations (which will basically be true
      only if the scheduler managed to provide each task with enough CPU
      capacity). We call that one 'test_slack';

   2. the task placement is close enough to the optimal placement
      energy-wise (which is computed off-line using the energy model
      and the rt-app conf). We call that one 'test_task_placement'.
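
To make that a bit more concrete, here is a rough Python sketch of
what the two checks boil down to. The parsed-log input format and the
10% margin are made up for illustration here, this is not the actual
LISA code:

    def test_slack(activations):
        # rt-app logs one record per activation, including its
        # 'slack' (time left before the deadline). A negative slack
        # means the activation finished late, i.e. the scheduler did
        # not provide the task with enough CPU capacity in time.
        return all(a["slack"] >= 0 for a in activations)

    def test_task_placement(measured_energy, optimal_energy):
        # 'optimal_energy' is computed off-line from the platform
        # energy model and the rt-app conf; the run passes if the
        # observed placement stays within a margin of that optimum.
        return measured_energy <= optimal_energy * 1.10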

For example, in order to pass the test, a periodic task that ramps up
from 10% to 70% over (say) 5s should probably start its execution on
little CPUs to not waste energy, and get up-migrated to big CPUs later
on to not miss activations. Otherwise one of the two checks will fail.
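
For the curious, such a ramp can be described to rt-app with a JSON
conf along these lines. This is a hand-written sketch rather than one
of the actual LISA confs: the 16ms period, the step granularity, the
task/phase names and the calibration value are all placeholders:

    import json

    # Periodic task ramping from 10% to 70% duty cycle over ~5s, in
    # 10% steps of ~720ms each (45 activations * 16ms period).
    period_us = 16000
    phases = {}
    for i, duty in enumerate(range(10, 80, 10)):
        phases["step%d" % i] = {
            "loop": 45,
            "run": period_us * duty // 100,  # busy time per activation
            "timer": {"ref": "ramp", "period": period_us},
        }

    conf = {
        "global": {"default_policy": "SCHED_OTHER",
                   "calibration": "CPU0"},
        "tasks": {"ramp_10_70": {"phases": phases}},
    }
    print(json.dumps(conf, indent=4))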

I'd like to emphasize that these test scenarios are *not* supposed to
look like real workloads at all. They've been designed with the sole
purpose of stressing specific code paths of the scheduler to spot any
obvious breakage. They've proven quite useful for us in the past.

All the tests are publicly available in the LISA repo [2].
</slightly off topic>


So, to come back to Vincent's patches, I managed to get a 10/10 pass
rate on most of the tests referred to as 'generic' in [1] on my Juno
r0. The kernel I tested had Morten's misfit patches, the EAS patches
v8, and Vincent's patches on top.

Although I still need to really get my head around all the implications
of changing PELT like that, I cannot see any obvious red flags from the
testing perspective here.

Thanks,
Quentin

---
[1] https://developer.arm.com/open-source/energy-aware-scheduling/eas-mainline-development
[2] https://github.com/ARM-software/lisa
