linux-kernel - Re: [PATCH 2/2] sched/fair: util_est: add running


Message-ID: <20180606103809.GG31675@e110439-lin>
Date:   Wed, 6 Jun 2018 11:38:09 +0100
From:   Patrick Bellasi <patrick.bellasi@....com>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     linux-kernel <linux-kernel@...r.kernel.org>,
        "open list:THERMAL" <linux-pm@...r.kernel.org>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        "Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
        Viresh Kumar <viresh.kumar@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Morten Rasmussen <morten.rasmussen@....com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Joel Fernandes <joelaf@...gle.com>,
        Steve Muckle <smuckle@...gle.com>, Todd Kjos <tkjos@...gle.com>
Subject: Re: [PATCH 2/2] sched/fair: util_est: add running_sum tracking

Hi Vincent,

On 06-Jun 10:26, Vincent Guittot wrote:

[...]

> For the 2 tasks of the example above, we have the pattern
> 
> Task 1
> state       RRRRSSSSSSERRRRSSSSSSERRRRSSSSSS
> util_avg    AAAADDDDDD AAAADDDDDD AAAADDDDDD
> 
> Task 2
> state       WWWWRRRRSSEWWWWRRRRSSEWWWWRRRRSS
> util_avg    DDDDAAAADD DDDDAAAADD DDDDAAAADD
> running_avg     AAAADDC    AAAADDC    AAAADD
> 
> R : Running 1ms, S: Sleep 1ms , W: Wait 1ms, E: Enqueue event 
> A: Accumulate 1ms, D: Decay 1ms, C: Copy util_avg value
> 
> the util_avg of T1 and T2 have the same pattern which is:
>   AAAADDDDDDAAAADDDDDDAAAADDDDDD
> and as a result, the same value which represents their utilization
> 
> For the running_avg of T2, there are only 2 decays after the last running
> phase and before the next enqueue,
> so the pattern will be AAAADDAAAA
> instead of the         AAAADDDDDDAAAA
> 
> the running_avg will report a higher value than reality

Right!... Your example above is really useful, thanks.
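
To put rough numbers on the patterns quoted above, here is a minimal
sketch that replays them through a simplified PELT-like signal. The
1 ms step, the floating-point math and the helper name
peak_after_pattern() are assumptions made for illustration; this is
not the kernel's fixed-point implementation, and the "C" (copy) step
is not modelled since the quoted text reduces T2's running_avg to the
plain AAAADD pattern.

```python
# Simplified PELT-like model (assumption): 1 ms steps and a 32 ms
# half-life, i.e. Y**32 == 0.5. Not the kernel's fixed-point code.
Y = 0.5 ** (1.0 / 32.0)
MAX_UTIL = 1024.0


def peak_after_pattern(pattern, periods=200):
    """Replay `pattern` `periods` times; return the value after the last 'A'.

    'A' accumulates for 1 ms, 'D' decays for 1 ms. Any other character
    is ignored, meaning the signal is not updated for that step.
    """
    util = 0.0
    peak = 0.0
    for _ in range(periods):
        for step in pattern:
            if step == "A":
                util = util * Y + (1.0 - Y) * MAX_UTIL
                peak = util  # remember the value at the end of running
            elif step == "D":
                util *= Y
    return peak


# Patterns taken from the quoted example.
t1_util_avg = peak_after_pattern("AAAADDDDDD")
t2_util_avg = peak_after_pattern("DDDDAAAADD")
t2_running_avg = peak_after_pattern("AAAADD")

print(f"T1 util_avg    at end of run: {t1_util_avg:.0f}")
print(f"T2 util_avg    at end of run: {t2_util_avg:.0f}")
print(f"T2 running_avg at end of run: {t2_running_avg:.0f}")
```

In this model the two util_avg signals settle on the same value at
the end of each running phase, while the running_avg pattern, which
sees only 2 decays per period, settles noticeably higher.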

Reasoning along the same lines, we can easily see that a 50% CFS task
co-scheduled with a 50% RT task, which delays the CFS one and has the
same period, will make the CFS task appear as a 100% task.

That is definitely not what we want to sample as estimated
utilization.
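
As a rough check of the 50% CFS + 50% RT case, the sketch below uses
the steady-state value of a simplified PELT-like geometric series at
the end of each running phase. The 10 ms period (5 ms waiting behind
RT, 5 ms running, no sleep), the 1 ms granularity and the helper name
steady_state_peak() are assumptions for illustration only.

```python
# Simplified PELT-like model (assumption): 1 ms steps and a 32 ms
# half-life, i.e. Y**32 == 0.5. Not the kernel's fixed-point code.
Y = 0.5 ** (1.0 / 32.0)
MAX_UTIL = 1024.0


def steady_state_peak(accumulate_ms, decay_ms):
    """Steady-state signal value at the end of the accumulate phase.

    Each period decays for `decay_ms` and then accumulates for
    `accumulate_ms`. The fixed point p of
        p = p * Y**(decay_ms + accumulate_ms) + MAX_UTIL * (1 - Y**accumulate_ms)
    is returned.
    """
    gained = 1.0 - Y ** accumulate_ms
    kept = Y ** (accumulate_ms + decay_ms)
    return MAX_UTIL * gained / (1.0 - kept)


# Hypothetical workload: the CFS task waits 5 ms behind the RT task,
# then runs 5 ms, and is immediately runnable again.
RUN_MS = 5
WAIT_MS = 5

# util_avg decays while the task waits, as in the quoted example.
cfs_util_avg = steady_state_peak(RUN_MS, WAIT_MS)
# running_avg is not updated while waiting, so it never decays here.
cfs_running_avg = steady_state_peak(RUN_MS, 0)

print(f"util_avg    at end of activation: {cfs_util_avg / MAX_UTIL:.0%}")
print(f"running_avg at end of activation: {cfs_running_avg / MAX_UTIL:.0%}")
```

With no decay step left in its pattern, running_avg saturates at the
maximum in this model, whereas util_avg stays at roughly half of it.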

The good news, if you like, is that util_avg already correctly
reports 50% at the end of each activation. Thus, when we collect
util_est samples, we already have the best utilization value we can
collect for a task.

The only time we collect "wrong" estimation samples is when there
is no idle time, so util_est could eventually be improved by
discarding samples in those cases... but I'm not entirely sure if and
how we can detect them.

Or we could just ensure we have idle time... as you are proposing in
the other thread.

Thanks again for pointing out the issue above.

-- 
#include <best/regards.h>

Patrick Bellasi