Date:   Tue, 25 May 2021 11:58:43 +0200
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     Odin Ugedal <odin@...d.al>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        "open list:CONTROL GROUP (CGROUP)" <cgroups@...r.kernel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/3] sched/fair: Add tg_load_contrib cfs_rq decay checking

On Tue, 18 May 2021 at 14:54, Odin Ugedal <odin@...d.al> wrote:
>
> Make sure cfs_rq does not contribute to task group load avg when
> checking if it is decayed. Due to how the pelt tracking works,
> the divider can result in a situation where:
>
> cfs_rq->avg.load_sum = 0
> cfs_rq->avg.load_avg = 4

Could you give more details about how cfs_rq->avg.load_avg can be 4
while cfs_rq->avg.load_sum is 0?

cfs_rq->avg.load_sum is decayed, and can become zero, when crossing a
period, which implies an update of cfs_rq->avg.load_avg. This means
that your case is generated by something outside the PELT formula,
like maybe the propagation of load in the tree. If this is the case,
we should find the error and fix it.
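
To make the argument above concrete, here is a rough userspace sketch,
not kernel code. The names toy_avg, toy_update_load and TOY_DIVIDER are
made up for illustration, the divider is an assumed constant, and the
decay is simplified to one halving per crossed period (the real decay is
much slower). The point is only that load_avg is recomputed from
load_sum whenever a period is crossed, so this update path alone cannot
leave load_sum at 0 with load_avg still non-zero:

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the load fields of struct sched_avg (hypothetical). */
struct toy_avg {
	uint64_t load_sum;
	unsigned long load_avg;
};

/* Assumed fixed divider; the kernel computes its divider differently. */
#define TOY_DIVIDER 1024UL

/*
 * Simplified update: halve load_sum once per crossed period, then
 * recompute load_avg from the decayed sum.
 * Returns true if a period was crossed and load_avg was refreshed.
 */
static bool toy_update_load(struct toy_avg *sa, unsigned int periods)
{
	if (!periods)
		return false;	/* no period crossed: nothing is recomputed */

	/* Shifting a 64-bit value by >= 64 is undefined, so clamp to zero. */
	sa->load_sum = periods >= 64 ? 0 : sa->load_sum >> periods;
	sa->load_avg = (unsigned long)(sa->load_sum / TOY_DIVIDER);
	return true;
}
```

In this model, starting from load_sum = 4 * TOY_DIVIDER and load_avg = 4,
a call with periods = 0 leaves both untouched, and any call that drives
load_sum to 0 also sets load_avg to 0 in the same step.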

> cfs_rq->tg_load_avg_contrib = 4
>
> If pelt tracking in this case does not cross a period, there is no
> "change" in load_sum, and therefore load_avg is not recalculated, and
> keeps its value.
>
> If this cfs_rq is then removed from the leaf list, it results in a
> situation where the load is never removed from the tg. If that happens,
> the fairness is permanently skewed.
>
> Fixes: 039ae8bcf7a5 ("sched/fair: Fix O(nr_cgroups) in the load balancing path")
> Signed-off-by: Odin Ugedal <odin@...d.al>
> ---
>  kernel/sched/fair.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 3248e24a90b0..ceda53c2a87a 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8004,6 +8004,9 @@ static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
>         if (cfs_rq->avg.runnable_sum)
>                 return false;
>
> +       if (cfs_rq->tg_load_avg_contrib)
> +               return false;
> +
>         return true;
>  }
>
> --
> 2.31.1
>
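
For reference, the effect of the hunk above on the reported state can be
sketched in userspace like this. It is only an illustration:
toy_cfs_rq, toy_is_decayed_before and toy_is_decayed_after are
hypothetical names, the struct is flattened, and only the *_sum checks
are modelled (the real cfs_rq_is_decayed() may test more than that):

```c
#include <stdbool.h>

/* Flattened toy version of the fields involved (hypothetical layout). */
struct toy_cfs_rq {
	unsigned long load_sum;
	unsigned long util_sum;
	unsigned long runnable_sum;
	unsigned long load_avg;
	unsigned long tg_load_avg_contrib;
};

/* Check without the patch: only the PELT sums are considered. */
static bool toy_is_decayed_before(const struct toy_cfs_rq *rq)
{
	return !rq->load_sum && !rq->util_sum && !rq->runnable_sum;
}

/* Check with the patch: a non-zero tg contribution keeps the cfs_rq listed. */
static bool toy_is_decayed_after(const struct toy_cfs_rq *rq)
{
	if (!toy_is_decayed_before(rq))
		return false;

	if (rq->tg_load_avg_contrib)
		return false;

	return true;
}
```

With load_sum = 0, load_avg = 4 and tg_load_avg_contrib = 4, the first
check reports the cfs_rq as decayed while the second does not. This
shows what the patch changes, but not how that state is reached in the
first place.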
