linux-kernel - Re: [PATCH 2/2] sched/fair: util_est: add running

Message-ID: <20180605204608.GA3510@joelaf.mtv.corp.google.com>
Date:   Tue, 5 Jun 2018 13:46:08 -0700
From:   Joel Fernandes <joel@...lfernandes.org>
To:     Patrick Bellasi <patrick.bellasi@....com>
Cc:     Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        "open list:THERMAL" <linux-pm@...r.kernel.org>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        "Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
        Viresh Kumar <viresh.kumar@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Morten Rasmussen <morten.rasmussen@....com>,
        connoro@...gle.com, Joel Fernandes <joelaf@...gle.com>,
        Steve Muckle <smuckle@...gle.com>, Todd Kjos <tkjos@...gle.com>
Subject: Re: [PATCH 2/2] sched/fair: util_est: add running_sum tracking

On Tue, Jun 05, 2018 at 05:54:31PM +0100, Patrick Bellasi wrote:
> On 05-Jun 17:31, Juri Lelli wrote:
> > On 05/06/18 16:11, Patrick Bellasi wrote:
> > 
> > [...]
> > 
> > > If I run an experiment with your example above, while using the
> > > performance governor to rule out any possible scale invariance
> > > difference, here is what I measure:
> > > 
> > >    Task1 (40ms delayed by the following Task2):
> > >                                mean          std     max
> > >       running_avg        455.387449    22.940168   492.0
> > >       util_avg           433.233288    17.395477   458.0
> > > 
> > >    Task2 (waking up at the same time as Task1 and running before it):
> > >                                mean          std     max
> > >       running_avg        430.281834    22.405175   455.0
> > >       util_avg           421.745331    22.098873   456.0
> > > 
> > > and if I compare Task1 above with another experiment where Task1 is
> > > running alone:
> > > 
> > >    Task1 (running alone):
> > >                                mean          std     max
> > >       running_avg        460.257895    22.103704   460.0
> > >       util_avg           435.119737    17.647556   461.0
> > 
> > Wait, why again in this last case running_avg != util_avg? :)
> 
> I _think_ it's mostly due to the rounding errors we have because of the
> reasons I've explained in the reply to Joel:
> 
>    https://lkml.org/lkml/2018/6/5/559
>    20180605152156.GD32302@...0439-lin
> 
> at the end, while commenting about the division overhead.
> 
> I should try the above examples while tracking the full signal at
> ___update_load_avg() time.

Is that the only issue? I think if a CFS task is preempted by another CFS
task, then with your patch we would also account the time it spends waiting
to run (runnable but not on the CPU) into the preempted task's running
utilization, which seems incorrect. Or did I miss something?
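
To make the concern concrete, here is a toy model, not kernel code. It
assumes a PELT-like geometric decay with a 32 ms half-life, a 100 ms period,
and a task that needs 40 ms of CPU per period; the names track(), ALONE and
DELAYED are made up for illustration. It compares a signal that grows only
while the task is on the CPU with one that also grows while the task is
runnable but waiting.

```python
# Toy PELT-style model, NOT kernel code. All names are hypothetical.
HALF_LIFE_MS = 32                     # assumed: contribution halves every 32 ms
Y = 0.5 ** (1.0 / HALF_LIFE_MS)       # per-millisecond decay factor
MAX_UTIL = 1024                       # full-scale utilization

# One state per millisecond:
#   "R" = running on the CPU, "W" = runnable but waiting, "S" = sleeping.
ALONE = ["R"] * 40 + ["S"] * 60                  # task has the CPU to itself
DELAYED = ["W"] * 40 + ["R"] * 40 + ["S"] * 20   # task waits 40 ms, then runs


def track(pattern, periods=20):
    """Return (running_only, running_plus_waiting) after `periods` repetitions."""
    running_only = 0.0
    running_plus_waiting = 0.0
    for _ in range(periods):
        for state in pattern:
            gain = MAX_UTIL * (1.0 - Y)
            # Decay both signals, then add this millisecond's contribution.
            running_only = running_only * Y + (gain if state == "R" else 0.0)
            running_plus_waiting = running_plus_waiting * Y + (
                gain if state in ("R", "W") else 0.0
            )
    return running_only, running_plus_waiting


if __name__ == "__main__":
    for name, pattern in (("alone", ALONE), ("delayed", DELAYED)):
        run, run_wait = track(pattern)
        print(f"{name:8s} running_only={run:7.1f} running+waiting={run_wait:7.1f}")
```

In this sketch the two signals are identical when the task runs alone, and
the second one is larger when the task is delayed, even though the task
consumed the same 40 ms of CPU in both cases. That extra amount is the
waiting time I am asking about.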

thanks,

 - Joel