Date:   Tue, 9 Nov 2021 13:29:35 +0100
From:   Dietmar Eggemann <dietmar.eggemann@....com>
To:     Zhaoyang Huang <huangzhaoyang@...il.com>,
        Xuewen Yan <xuewen.yan94@...il.com>
Cc:     Johannes Weiner <hannes@...xchg.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Michal Hocko <mhocko@...nel.org>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        Zhaoyang Huang <zhaoyang.huang@...soc.com>,
        "open list:MEMORY MANAGEMENT" <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        xuewen.yan@...soc.com, Ke Wang <Ke.Wang@...soc.com>
Subject: Re: [Resend PATCH] psi : calc cfs task memstall time more precisely

On 08/11/2021 10:20, Zhaoyang Huang wrote:
> On Mon, Nov 8, 2021 at 4:49 PM Xuewen Yan <xuewen.yan94@...il.com> wrote:
>>
>> Hi Dietmar
>>
>> On Sat, Nov 6, 2021 at 1:20 AM Dietmar Eggemann
>> <dietmar.eggemann@....com> wrote:
>>>
>>> On 05/11/2021 06:58, Zhaoyang Huang wrote:

[...]

>>>>> This will let the idle task (swapper) pass. Is this intended? Or do you
>>>>> want to only let CFS tasks (including SCHED_IDLE) pass?
>>>> idle tasks will NOT call psi_memstall_xxx. We just want CFS tasks to
>>>> scale the STALL time.
>>>
>>> Not sure I get this.
>>>
>>> __schedule() -> psi_sched_switch() -> psi_task_change() ->
>>> psi_group_change() -> record_times() -> psi_memtime_fixup()
>>>
>>> is something other than calling psi_memstall_enter() or _leave(), isn't it?
>>>
>>> IMHO, at least record_times() can be called with current equal
>>> swapper/X. Or is it that PSI_MEM_SOME is never set for the idle task in
>>> this callstack? I don't know the PSI internals.
> According to my understanding, PSI_MEM_SOME represents the state of a
> CORE on which at least one task is trapped in the memstall path (a task
> is only counted once it has called psi_memstall_enter()). record_times()
> is responsible for collecting the CORE's delta time since that state
> started. What we are doing is making the delta time more precise, so the
> idle task is irrelevant here.

Coming back to the original snippet of the patch.

static unsigned long psi_memtime_fixup(u32 growth)
{

    if (!(current->policy == SCHED_NORMAL ||
          current->policy == SCHED_BATCH))
        return growth_fixed;

With this condition:

(1) You're not bailing when current is the idle task. Its policy is 0
    (SCHED_NORMAL).

(2) But you're bailing for a SCHED_IDLE (CFS) task.

I'm not sure this is intended?

Since you want to do the scaling later based on what's left of the CPU
capacity for CFS tasks, my hunch is that you'd rather want to do:

    if (current->sched_class != &fair_sched_class)
        return growth_fixed;
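
To make the difference between the two conditions concrete, here is a
rough userspace sketch. It is not kernel code: struct mock_task, the
mock_* sched classes and the MOCK_SCHED_* constants are made up for
illustration (the policy values mirror include/uapi/linux/sched.h), and
it assumes the idle task has policy 0 and runs in the idle sched class.

```c
/* Userspace mock for illustration only; all mock_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

/* Policy numbers mirroring include/uapi/linux/sched.h. */
enum {
    MOCK_SCHED_NORMAL = 0,
    MOCK_SCHED_FIFO   = 1,
    MOCK_SCHED_BATCH  = 3,
    MOCK_SCHED_IDLE   = 5,
};

struct mock_sched_class { const char *name; };

const struct mock_sched_class mock_fair_sched_class = { "fair" };
const struct mock_sched_class mock_idle_sched_class = { "idle" };
const struct mock_sched_class mock_rt_sched_class   = { "rt" };

struct mock_task {
    int policy;
    const struct mock_sched_class *sched_class;
};

/* Condition from the patch: true means "bail out, do not scale". */
bool bails_on_policy(const struct mock_task *p)
{
    return !(p->policy == MOCK_SCHED_NORMAL ||
             p->policy == MOCK_SCHED_BATCH);
}

/* Suggested alternative: bail out for everything that is not CFS. */
bool bails_on_class(const struct mock_task *p)
{
    return p->sched_class != &mock_fair_sched_class;
}

/* Idle task (swapper): policy 0, but not in the fair class. */
const struct mock_task mock_swapper   = { MOCK_SCHED_NORMAL, &mock_idle_sched_class };
/* SCHED_IDLE task: a CFS task with its own policy number. */
const struct mock_task mock_idle_cfs  = { MOCK_SCHED_IDLE,   &mock_fair_sched_class };
const struct mock_task mock_normal    = { MOCK_SCHED_NORMAL, &mock_fair_sched_class };
const struct mock_task mock_fifo      = { MOCK_SCHED_FIFO,   &mock_rt_sched_class };

void mock_check(void)
{
    /* (1) The policy check lets the idle task through ... */
    assert(!bails_on_policy(&mock_swapper));
    /* (2) ... but bails for a SCHED_IDLE (CFS) task. */
    assert(bails_on_policy(&mock_idle_cfs));

    /* The class check does the opposite in both cases. */
    assert(bails_on_class(&mock_swapper));
    assert(!bails_on_class(&mock_idle_cfs));
}
```

For SCHED_NORMAL CFS tasks and for RT tasks both conditions agree in
this mock; they only differ for the idle task and for SCHED_IDLE tasks.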

What are the possible sched classes of current in psi_memtime_fixup()?
