linux-kernel - Re: [RFC PATCH] sched/fair: correct llc shared domain's number of busy CPUs


Message-ID: <CAKfTPtBJvd244jdXXvjRcjcFdtVEpTauDjUpAs-uzWdKafbZrw@mail.gmail.com>
Date:   Mon, 4 May 2020 15:29:41 +0200
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     Hillf Danton <hdanton@...a.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>,
        lkml <linux-kernel@...r.kernel.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Valentin Schneider <valentin.schneider@....com>,
        Dietmar Eggemann <dietmar.eggemann@....com>
Subject: Re: [RFC PATCH] sched/fair: correct llc shared domain's number of
 busy CPUs

On Mon, 4 May 2020 at 13:41, Hillf Danton <hdanton@...a.com> wrote:
>
>
> On Mon, 4 May 2020 09:53:36 Vincent Guittot wrote:
> >
> > On Mon, 4 May 2020 at 03:57, Hillf Danton wrote:
> > >
> > > The comment says, if there is an imbalance between LLC domains (IOW we
> > > could increase the overall cache use),  we need some less-loaded LLC
> > > domain to pull some load.
> > >
> > > To show that imbalance, record busy CPUs as they come and go by doing
> > > a minor cleanup for sd::nohz_idle.
> >
> > Your comment failed to explain why we can get rid of sd->nohz_idle
> >
> The serialization added in 25f55d9d01ad ("sched: Fix init NOHZ_IDLE flag") to
> updating nr_busy_cpus is no longer needed after 0e369d757578 ("sched/core:
> Replace sd_busy/nr_busy_cpus with sched_domain_shared") AFAICT because a

I don't see the link between commit 0e369d757578 and the fact that we
can remove the nohz_idle field.

> recorded idle/busy CPU does not mean the current CPU could not become idle or
> busy. The right thing is to update the counter if we have a valid sd under rcu.

No, that is not the root cause. The sd is per CPU, so each CPU has its
own sd->nohz_idle: if CPU A sets its sd->nohz_idle, CPU B is not
impacted and still has to set its own.

We must ensure that nr_busy_cpus is inc/dec only once when
transitioning from/to idle/busy state in order to keep the shared
nr_busy_cpus correct. But set_cpu_sd_state_busy() is called from
scheduler_tick() which means potentially every tick:

scheduler_tick() -> trigger_load_balance() -> nohz_balancer_kick() ->
nohz_balance_exit_idle() -> set_cpu_sd_state_busy()

The nohz_idle field is there to prevent incrementing nr_busy_cpus at
every tick. But since 00357f5ec5d6 ("sched/nohz: Clean up nohz
enter/exit"), set_cpu_sd_state_busy() is called from
nohz_balance_exit_idle(), and the latter has a similar mechanism with
rq->nohz_tick_stopped, so sd_llc->nohz_idle is useless.
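
To make the argument concrete, here is a minimal user-space sketch in C of
the guard described above. It is not kernel code: every model_* name,
MODEL_NR_CPUS, and the simplified structs are hypothetical stand-ins, and
the real functions do more work (cpumask updates, atomics, RCU). It only
shows that a per-CPU flag checked on the exit-idle path, playing the role
of rq->nohz_tick_stopped, is enough to keep a shared busy counter from
being incremented on every tick.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 4   /* hypothetical CPU count for the model */

/* Stand-in for the counter shared by all CPUs of one LLC domain. */
struct model_sd_shared {
    int nr_busy_cpus;
};

/* Stand-in for the per-CPU runqueue flag (rq->nohz_tick_stopped). */
struct model_rq {
    bool nohz_tick_stopped;
};

/* All CPUs start busy, so the shared counter starts at MODEL_NR_CPUS. */
static struct model_sd_shared llc_shared = { .nr_busy_cpus = MODEL_NR_CPUS };
static struct model_rq rqs[MODEL_NR_CPUS];

/* Models the enter-idle path: decrement only on a busy -> idle transition. */
static void model_enter_idle(int cpu)
{
    struct model_rq *rq = &rqs[cpu];

    if (rq->nohz_tick_stopped)
        return;                     /* already accounted as idle */

    rq->nohz_tick_stopped = true;
    llc_shared.nr_busy_cpus--;
}

/* Models the exit-idle path: increment only on an idle -> busy transition. */
static void model_exit_idle(int cpu)
{
    struct model_rq *rq = &rqs[cpu];

    if (!rq->nohz_tick_stopped)
        return;                     /* already accounted as busy */

    rq->nohz_tick_stopped = false;
    llc_shared.nr_busy_cpus++;
}

/* Models the tick, which may reach the exit-idle path on every call. */
static void model_scheduler_tick(int cpu)
{
    model_exit_idle(cpu);
}
```

In this model, calling model_scheduler_tick() repeatedly on a CPU that is
already busy leaves nr_busy_cpus unchanged, which is the property that
sd->nohz_idle was guarding.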

>
> > you remove the use of sd->nohz_idle but you don't remove it from
> > struct sched_domain
>
> A separate cleanup for it is needed if it's no longer used somewhere else.

Please remove it in the same patch.

>
> Hillf
>