linux-kernel - Re: [PATCH] sched/fair: Make PELT signal more accurate

Message-ID: <CAKfTPtB_xZj9Keea4ZSLVqgpMbjDBW7tPFaNZ5mDKN7Dt_0yxg@mail.gmail.com>
Date:   Mon, 7 Aug 2017 15:24:15 +0200
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     Joel Fernandes <joelaf@...gle.com>
Cc:     linux-kernel <linux-kernel@...r.kernel.org>,
        kernel-team@...roid.com, Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@....com>,
        Brendan Jackman <brendan.jackman@....com>,
        Dietmar Eggeman <dietmar.eggemann@....com>,
        Ingo Molnar <mingo@...hat.com>
Subject: Re: [PATCH] sched/fair: Make PELT signal more accurate

Hi Joel,

On 4 August 2017 at 17:40, Joel Fernandes <joelaf@...gle.com> wrote:
> The PELT signals (sa->load_avg and sa->util_avg) are not updated if the amount
> accumulated during a single update doesn't cross a period boundary. This is
> fine in cases where the amount accrued is much smaller than the size of a
> single PELT window (1ms). However, if the amount accrued is high, then the
> relative error (calculated against what the actual signal would be had we
> updated the averages) can be quite high - as much as 3-6% in my testing. On
> plotting the signals, I found that the errors are especially high when we update
> just before the period boundary is hit. These errors can be significantly
> reduced if we update the averages more often.
>
> In order to fix this, this patch also checks how much time has elapsed since
> the last update and updates the averages if it has been long enough (as a
> threshold I chose 128us).

Why 128us and not 512us, for example?

A 128us threshold means that util/load_avg can be computed 8 times more
often, which means up to 16 times more calls to div_u64.
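
The arithmetic behind those figures can be sketched as below. This is
only an illustration: the helper names are hypothetical, and the cost
of two div_u64 calls per update is an assumption made to reproduce the
16, not something taken from the patch.

```c
#include <assert.h>

/* One PELT window is 1024us (~1ms), the period size discussed above. */
#define PELT_PERIOD_US 1024u

/*
 * Assumption for illustration only: each update of the averages
 * costs two div_u64() calls.
 */
#define DIVS_PER_UPDATE 2u

/* Upper bound on updates of the averages within one PELT period. */
static unsigned int max_updates_per_period(unsigned int threshold_us)
{
	assert(threshold_us > 0 && threshold_us <= PELT_PERIOD_US);
	return PELT_PERIOD_US / threshold_us;
}

/* Upper bound on div_u64() calls within one PELT period. */
static unsigned int max_divs_per_period(unsigned int threshold_us)
{
	return max_updates_per_period(threshold_us) * DIVS_PER_UPDATE;
}
```

Under these assumptions a 128us threshold allows up to 8 updates and
16 divisions per period, while 512us allows up to 2 updates and 4
divisions.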

>
> In order to compare the signals with/without the patch I created a synthetic
> test (20ms runtime, 100ms period) and analyzed the signals and created a report
> on the analysis data/plots both with and without the fix:
> http://www.linuxinternals.org/misc/pelt-error.pdf

The glitch described on page 2 shows a decrease of util_avg that is
not linked to the accuracy of the calculation; it is due to the use of
the wrong range when computing util_avg.
Commit 625ed2bf049d ("sched/cfs: Make util/load_avg more stable") fixes
this glitch.
The lower peak value on page 3 is probably linked to the inaccuracy.
I agree that there is an inaccuracy (a max absolute value of 22), but
that is a trade-off in favor of less overhead. Have you seen wrong
behavior because of this inaccuracy?
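
To illustrate the range issue, here is a minimal user-space sketch. It
assumes an idealized always-running task and that the fix amounts to
dividing by LOAD_AVG_MAX - 1024 + period_contrib instead of
LOAD_AVG_MAX, which is how I read that commit's changelog. The function
names are hypothetical and this is not the kernel code.

```c
#include <assert.h>
#include <stdint.h>

#define LOAD_AVG_MAX 47742u	/* kernel's maximum PELT sum */
#define PELT_PERIOD  1024u	/* one PELT window, in us */

/*
 * Idealized sum for an always-running task, period_contrib us into
 * the current window: the decayed past windows (approximated here as
 * LOAD_AVG_MAX - 1024) plus the part of the window already elapsed.
 */
static uint64_t busy_sum(uint32_t period_contrib)
{
	assert(period_contrib < PELT_PERIOD);
	return (uint64_t)(LOAD_AVG_MAX - PELT_PERIOD) + period_contrib;
}

/* Fixed range: the rest of the current window counts as idle. */
static uint32_t util_avg_fixed_range(uint32_t period_contrib)
{
	return (uint32_t)(busy_sum(period_contrib) * 1024 / LOAD_AVG_MAX);
}

/* Range that follows the elapsed part of the current window. */
static uint32_t util_avg_scaled_range(uint32_t period_contrib)
{
	uint32_t divider = LOAD_AVG_MAX - PELT_PERIOD + period_contrib;

	return (uint32_t)(busy_sum(period_contrib) * 1024 / divider);
}
```

In this model the fixed range drops to 1002 right after a period
boundary and climbs back towards 1023 by the end of the window, even
though the task never stops running. With the scaled range the value
stays constant across the window.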

>
> With the patch, the error in the signal is significantly reduced, and is
> non-existent beyond a small negligible amount.
>
> Cc: Vincent Guittot <vincent.guittot@...aro.org>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Juri Lelli <juri.lelli@....com>
> Cc: Brendan Jackman <brendan.jackman@....com>
> Cc: Dietmar Eggeman <dietmar.eggemann@....com>
> Signed-off-by: Joel Fernandes <joelaf@...gle.com>
> ---
>  kernel/sched/fair.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 4f1825d60937..1347643737f3 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2882,6 +2882,7 @@ ___update_load_avg(u64 now, int cpu, struct sched_avg *sa,
>                   unsigned long weight, int running, struct cfs_rq *cfs_rq)
>  {
>         u64 delta;
> +       int periods;
>
>         delta = now - sa->last_update_time;
>         /*
> @@ -2908,9 +2909,12 @@ ___update_load_avg(u64 now, int cpu, struct sched_avg *sa,
>          * accrues by two steps:
>          *
>          * Step 1: accumulate *_sum since last_update_time. If we haven't
> -        * crossed period boundaries, finish.
> +        * crossed period boundaries and the time since last update is small
> +        * enough, we're done.
>          */
> -       if (!accumulate_sum(delta, cpu, sa, weight, running, cfs_rq))
> +       periods = accumulate_sum(delta, cpu, sa, weight, running, cfs_rq);
> +
> +       if (!periods && delta < 128)
>                 return 0;
>
>         /*
> --
> 2.14.0.rc1.383.gd1ce394fe2-goog
>