Date:   Thu, 8 Nov 2018 17:04:45 +0100
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     Quentin Perret <quentin.perret@....com>
Cc:     Dietmar Eggemann <dietmar.eggemann@....com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Morten Rasmussen <Morten.Rasmussen@....com>,
        Patrick Bellasi <patrick.bellasi@....com>,
        Paul Turner <pjt@...gle.com>, Ben Segall <bsegall@...gle.com>,
        Thara Gopinath <thara.gopinath@...aro.org>,
        pkondeti@...eaurora.org
Subject: Re: [PATCH v5 2/2] sched/fair: update scale invariance of PELT

On Thu, 8 Nov 2018 at 12:35, Quentin Perret <quentin.perret@....com> wrote:
>
> On Wednesday 07 Nov 2018 at 11:47:09 (+0100), Dietmar Eggemann wrote:
> > The important bit for EAS is that it only uses utilization in the
> > non-overutilized case. Here, the utilization signals should look the
> > same between the two approaches, leaving aside tasks with long periods
> > like the 39/80ms example above.
> > There are also some advantages for EAS with time scaling: (1) faster
> > overutilization detection when a big task runs on a little CPU, and (2)
> > a higher (initial) task utilization value when such a task migrates from
> > a little to a big CPU.
>
> Agreed, these patches should help detect over-utilized scenarios
> faster and more reliably, which is probably a good thing. I'll try to
> have a look in more detail soon.
>
> > We should run our EAS task placement tests with your time scaling patches.
>
> Right, I tried these patches with the synthetic tests we usually run
> against our upstream EAS dev branch (see [1]), and I couldn't see any
> regression, which is a good sign :-)

Thanks for testing

>
>
> <slightly off topic>
> Since most people are probably not familiar with these tests, I'll try
> to elaborate a little bit more. They are unit tests aimed at stressing
> particular behaviours of the scheduler on asymmetric platforms. More
> precisely, they check that capacity-awareness/misfit and EAS are
> actually able to up-migrate and down-migrate tasks between big and
> little CPUs when necessary.
>
> The tests are based on rt-app and ftrace. They basically run a whole lot
> of scenarios with rt-app (small tasks, big tasks, a mix of both, tasks
> changing behaviour, ramping up, ramping down, ...), pull a trace of the
> execution and check that:
>
>    1. the task(s) did not miss activations (which will basically be true
>       only if the scheduler managed to provide each task with enough CPU
>       capacity). We call that one 'test_slack';
>
>    2. the task placement is close enough to the optimal placement
>       energy-wise (which is computed off-line using the energy model
>       and the rt-app conf). We call that one 'test_task_placement'.
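>
> As a rough illustration (a hand-written sketch, not the actual LISA
> implementation), the slack check essentially boils down to something
> like this, assuming per-activation wakeup/completion/deadline
> timestamps have already been parsed out of the ftrace trace:
>
>     # Hypothetical helper; 'activations' is assumed to be a list of
>     # (wakeup_ts, end_ts, deadline_ts) tuples in seconds.
>     def check_slack(activations, min_slack_pct=0.0):
>         failures = []
>         for wakeup, end, deadline in activations:
>             period = deadline - wakeup
>             # Slack is the fraction of the period still left when the
>             # activation completed; negative slack means the task
>             # overran its period, i.e. a missed activation.
>             slack = (deadline - end) / period * 100.0
>             if slack < min_slack_pct:
>                 failures.append((wakeup, slack))
>         return failures
>
>     # The test passes if check_slack(...) returns no failures.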
>
> For example, in order to pass the test, a periodic task that ramps up
> from 10% to 70% over (say) 5s should probably start its execution on
> little CPUs to not waste energy, and get up-migrated to big CPUs later
> on to not miss activations. Otherwise one of the two checks will fail.
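>
> In rt-app terms, such a ramp is just a sequence of short periodic
> phases with an increasing duty cycle. Here is a minimal sketch
> (hypothetical values, and a hand-rolled dict rather than the LISA
> profile helpers) of roughly what the generated rt-app JSON profile
> can look like:
>
>     import json
>
>     period_us = 16000   # 16ms activation period, chosen for illustration
>     n_steps = 7         # 10% -> 70% utilization in 10% steps over ~5s
>
>     phases = {}
>     for i, duty_pct in enumerate(range(10, 80, 10)):
>         phases["phase%d" % i] = {
>             # Repeat this phase long enough to cover ~5s/7 of the ramp.
>             "loop": int(5_000_000 / n_steps / period_us),
>             # Busy time per activation, in microseconds.
>             "run": int(period_us * duty_pct / 100),
>             # Sleep until the next period boundary.
>             "timer": {"ref": "unique", "period": period_us},
>         }
>
>     profile = {"tasks": {"ramp_task": {"phases": phases}}}
>     print(json.dumps(profile, indent=4))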
>
> I'd like to emphasize that these test scenarios are *not* supposed to
> look like real workloads at all. They've been designed with the sole
> purpose of stressing specific code paths of the scheduler to spot any
> obvious breakage. They've proven quite useful for us in the past.
>
> All the tests are publicly available in the LISA repo [2].
> </slightly off topic>
>
>
> So, to come back to Vincent's patches, I managed to get a 10/10 pass
> rate on most of the tests referred to as 'generic' in [1] on my Juno
> r0. The kernel I tested had Morten's misfit patches, the EAS patches
> v8, and Vincent's patches on top.
>
> Although I still need to really get my head around all the implications
> of changing PELT like that, I cannot see any obvious red flags from the
> testing perspective here.
>
> Thanks,
> Quentin
>
> ---
> [1] https://developer.arm.com/open-source/energy-aware-scheduling/eas-mainline-development
> [2] https://github.com/ARM-software/lisa
