lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 4 Dec 2020 14:27:01 +0000
From:   Mel Gorman <mgorman@...hsingularity.net>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Aubrey Li <aubrey.li@...ux.intel.com>,
        Barry Song <song.bao.hua@...ilicon.com>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Ziljstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Valentin Schneider <valentin.schneider@....com>,
        Linux-ARM <linux-arm-kernel@...ts.infradead.org>
Subject: Re: [PATCH 06/10] sched/fair: Clear the target CPU from the cpumask
 of CPUs searched

On Fri, Dec 04, 2020 at 02:17:20PM +0100, Vincent Guittot wrote:
> On Fri, 4 Dec 2020 at 14:13, Vincent Guittot <vincent.guittot@...aro.org> wrote:
> >
> > On Fri, 4 Dec 2020 at 12:30, Mel Gorman <mgorman@...hsingularity.net> wrote:
> > >
> > > On Fri, Dec 04, 2020 at 11:56:36AM +0100, Vincent Guittot wrote:
> > > > > The intent was that the sibling might still be an idle candidate. In
> > > > > the current draft of the series, I do not even clear this so that the
> > > > > SMT sibling is considered as an idle candidate. The reasoning is that if
> > > > > there are no idle cores then an SMT sibling of the target is as good an
> > > > > idle CPU to select as any.
> > > >
> > > > Isn't the purpose of select_idle_smt ?
> > > >
> > >
> > > Only in part.
> > >
> > > > select_idle_core() looks for an idle core and opportunistically saves
> > > > an idle CPU candidate to skip select_idle_cpu. In this case this is
> > > > useless loops for select_idle_core() because we are sure that the core
> > > > is not idle
> > > >
> > >
> > > If select_idle_core() finds an idle candidate other than the sibling,
> > > it'll use it if there is no idle core -- it picks a busy sibling based
> > > on a linear walk of the cpumask. Similarly, select_idle_cpu() is not
> >
> > My point is that it's a waste of time to loop the sibling cpus of
> > target in select_idle_core because it will not help to find an idle
> > core. The sibling  cpus will then be check either by select_idle_cpu
> > of select_idle_smt
> 

I understand and you're right, the full loop was in the context of a series
that unified select_idle_* where it made sense. The version I'm currently
testing aborts the SMT search if a !idle sibling is encountered. That
means that select_idle_core() will no longer scan the entire domain if
there are no idle cores.

https://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git/commit/?h=sched-sissearch-v2r6&id=eb04a344cf7d7ca64c0c8fc0bcade261fa08c19e

With the patch on its own, it does mean that select_idle_sibling
starts over because SMT siblings might have been cleared. As an aside,
select_idle_core() has it's own problems even then.  It can start a scan
for an idle sibling when cpu_rq(target)->nr_running is very large --
over 100+ running tasks which is almost certainly a useless scan for
cores. However, I haven't done anything with that in this series as it
seemed like it would be follow-up work.

> also, while looping the cpumask, the sibling cpus of not idle cpu are
> removed and will not be check
> 

True and I spotted this. I think the load_balance_mask can be abused to
clear siblings during select_idle_core() while using select_idle_mask to
track CPUs that have not been scanned yet so select_idle_cpu only scans
CPUs that have not already been visited.

https://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git/commit/?h=sched-sissearch-v2r6&id=a6e986dae38855e3be26dfde86bbef1617431dd1

As both the idle candidate and the load_balance_mask abuse are likely to
be controversial, I shuffled the series so that it's ordered from least
least controversial to most controversial.

This
https://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git/log/?h=sched-sissearch-v2r6
is what is currently being tested. It'll take most of the weekend and I'll
post them properly if they pass tests and do not throw up nasty surprises.

-- 
Mel Gorman
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ