Date:   Tue, 25 May 2021 12:33:35 +0200
From:   Odin Ugedal <odin@...d.al>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     Odin Ugedal <odin@...d.al>, Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        "open list:CONTROL GROUP (CGROUP)" <cgroups@...r.kernel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/3] sched/fair: Add tg_load_contrib cfs_rq decay checking

Hi,

On Tue, 25 May 2021 at 11:58, Vincent Guittot <vincent.guittot@...aro.org> wrote:
> Could you give more details about how cfs_rq->avg.load_avg = 4 but
> cfs_rq->avg.load_sum = 0 ?
>
> cfs_rq->avg.load_sum is decayed and can become null when crossing
> period which implies an update of cfs_rq->avg.load_avg.  This means
> that your case is generated by something outside the pelt formula ...
> like maybe the propagation of load in the tree. If this is the case,
> we should find the error and fix it

Ahh, yeah, that could probably be described better.

As far as I have found out, it happens because the PELT divider changes:
the output of "get_pelt_divider(&cfs_rq->avg)" differs between the add and
the remove, so a different value is removed than was added.
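
To make the divider mismatch concrete, here is a toy model in Python. It is
only a sketch, not the code path in fair.c: ToyCfsRq, add() and remove() are
made-up names, the numbers are chosen for illustration, and it assumes the
sum is added and removed as "load_avg * divider", with the divider computed
as LOAD_AVG_MAX - 1024 + period_contrib.

```python
LOAD_AVG_MAX = 47742  # maximum PELT sum, same constant as in the kernel


def pelt_divider(period_contrib):
    # Same shape as get_pelt_divider(): the result depends on period_contrib.
    return LOAD_AVG_MAX - 1024 + period_contrib


def sub_positive(value, delta):
    # Subtract, but clamp at zero instead of going negative.
    return max(value - delta, 0)


class ToyCfsRq:
    """Hypothetical stand-in for the load part of cfs_rq->avg."""

    def __init__(self):
        self.load_avg = 0
        self.load_sum = 0

    def add(self, load_avg, period_contrib):
        self.load_avg += load_avg
        self.load_sum += load_avg * pelt_divider(period_contrib)

    def remove(self, load_avg, period_contrib):
        self.load_avg = sub_positive(self.load_avg, load_avg)
        self.load_sum = sub_positive(
            self.load_sum, load_avg * pelt_divider(period_contrib)
        )


rq = ToyCfsRq()
rq.add(200, period_contrib=0)       # added with the smallest divider
rq.add(4, period_contrib=0)
rq.remove(200, period_contrib=1023)  # removed with a larger divider

print(rq.load_avg, rq.load_sum)
```

In this model the remove subtracts more from load_sum than the two adds put
in, so load_sum is clamped to 0 while load_avg still holds the remaining 4.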

Inside PELT itself, this cannot happen. When PELT changes load_sum, it
recalculates load_avg from load_sum rather than from the delta, afaik.
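
A rough sketch of why the two stay in sync inside PELT, again in Python and
heavily simplified: resync_after_decay() is a made-up helper, and the decay
is reduced to whole half-lives (32 periods each) instead of what decay_load()
and ___update_load_avg() really do. The point is only that load_avg is
derived from the decayed load_sum, so a zero sum gives a zero average.

```python
LOAD_AVG_MAX = 47742  # maximum PELT sum, same constant as in the kernel


def resync_after_decay(load_sum, half_lives, period_contrib=0, weight=1):
    """Decay load_sum, then recompute load_avg from the new sum."""
    divider = LOAD_AVG_MAX - 1024 + period_contrib
    load_sum >>= half_lives                 # each half-life halves the sum
    load_avg = weight * load_sum // divider  # derived from the sum, no delta
    return load_sum, load_avg


# No decay: a sum worth 4 units of load gives load_avg == 4.
print(resync_after_decay(4 * 46718, half_lives=0))
# Long idle: the sum decays to 0 and load_avg follows it to 0.
print(resync_after_decay(4 * 46718, half_lives=64))
```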

And as you say, the "issue" is therefore (as I see it) outside of PELT.
Because of how the PELT divider changes, I assume it is hard to pinpoint
where the issue is. I can try to find a clear path where we can see what is
added to and removed from both cfs_rq->avg.load_sum and cfs_rq->avg.load_avg,
to better pinpoint what is happening.

Previously I thought this was a result of precision loss due to division and
multiplication during load add/remove inside fair.c, but I am not sure that
is the issue, or is it?

If my above line of thought makes sense, do you still view this as an error
outside PELT, or do you see another possible/better solution?

Will investigate further.

Thanks
Odin
