Message-ID: <20190718113758.GN3402@hirez.programming.kicks-ass.net>
Date: Thu, 18 Jul 2019 13:37:58 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Subhra Mazumdar <subhra.mazumdar@...cle.com>
Cc: linux-kernel@...r.kernel.org, mingo@...hat.com, tglx@...utronix.de,
prakash.sangappa@...cle.com, dhaval.giani@...cle.com,
daniel.lezcano@...aro.org, vincent.guittot@...aro.org,
viresh.kumar@...aro.org, tim.c.chen@...ux.intel.com,
mgorman@...hsingularity.net, Paul Turner <pjt@...gle.com>
Subject: Re: [RFC PATCH 2/3] sched: change scheduler to give preference to
soft affinity CPUs

On Wed, Jul 17, 2019 at 08:31:25AM +0530, Subhra Mazumdar wrote:
>
> On 7/2/19 10:58 PM, Peter Zijlstra wrote:
> > On Wed, Jun 26, 2019 at 03:47:17PM -0700, subhra mazumdar wrote:
> > > The soft affinity CPUs present in the cpumask cpus_preferred are used by
> > > the scheduler in two levels of search. First is in determining wake
> > > affinity, which chooses the LLC domain, and second while searching for
> > > idle CPUs in the
> > > LLC domain. In the first level it uses cpus_preferred to prune out the
> > > search space. In the second level it first searches the cpus_preferred and
> > > then cpus_allowed. Using the affinity_unequal flag it breaks early to avoid
> > > any overhead in the scheduler fast path when soft affinity is not used.
> > > This only changes the wake up path of the scheduler, the idle balancing
> > > is unchanged; together they achieve the "softness" of scheduling.
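(For reference, a stand-alone user-space model of that two-pass search;
illustrative only, not the patch -- only the names cpus_preferred,
cpus_allowed and affinity_unequal come from the description above.)

/*
 * Stand-alone model of the two-pass search described above; illustrative
 * only, not the patch itself.
 */
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 8

struct task {
	unsigned long cpus_preferred;	/* soft affinity: where we'd like to run */
	unsigned long cpus_allowed;	/* hard affinity: where we may run */
	bool affinity_unequal;		/* preferred != allowed */
};

static int find_idle_in(unsigned long mask, const bool *idle)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if ((mask & (1UL << cpu)) && idle[cpu])
			return cpu;
	return -1;
}

/* Second-level search: preferred CPUs first, then the rest of allowed. */
static int select_idle_cpu(const struct task *p, const bool *idle)
{
	int cpu = find_idle_in(p->cpus_preferred, idle);

	/* The early break keeps the fast path cheap when soft affinity
	 * is unused (preferred == allowed). */
	if (cpu >= 0 || !p->affinity_unequal)
		return cpu;

	return find_idle_in(p->cpus_allowed & ~p->cpus_preferred, idle);
}

int main(void)
{
	bool idle[NR_CPUS] = { [5] = true };	/* only CPU 5 is idle */
	struct task p = {
		.cpus_preferred   = 0x0f,	/* CPUs 0-3 */
		.cpus_allowed     = 0xff,	/* CPUs 0-7 */
		.affinity_unequal = true,
	};

	printf("picked CPU %d\n", select_idle_cpu(&p, idle));	/* -> 5 */
	return 0;
}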
> > I really dislike this implementation.
> >
> > I thought the idea was to remain work conserving (in so far as that
> > we're that anyway), so changing select_idle_sibling() doesn't make sense
> > to me. If there is idle, we use it.
> >
> > Same for newidle; which you already retained.
> The scheduler is already not work conserving in many ways. Soft affinity is
> only for those who want to use it and has no side effects when not used.
> Also, given the way the scheduler is implemented, it may not be possible to
> make the first level of search work conserving; I am open to ideas.
I really don't understand the premise of this soft affinity stuff then.
I understood it was to allow spreading when under-utilized but to group
when over-utilized; you're arguing for the exact opposite, which doesn't
make sense.
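Concretely, something like this work-conserving shape (again illustrative
only, reusing struct task, NR_CPUS and find_idle_in() from the model above):

/* Illustrative only: any idle allowed CPU wins; the preferred mask
 * merely biases placement once nothing is idle. */
static int select_cpu_work_conserving(const struct task *p, const bool *idle)
{
	int cpu = find_idle_in(p->cpus_allowed, idle);

	if (cpu >= 0)
		return cpu;	/* under-utilized: spread, use the idle CPU */

	/* over-utilized: pack onto the preferred set; a real
	 * implementation would pick the least-loaded CPU here */
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (p->cpus_preferred & (1UL << cpu))
			return cpu;
	return -1;
}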
> > And I also really don't want a second utilization tipping point; we
> > already have the overloaded thing.
> The numbers in the cover letter show that a static tipping point will not
> work for all workloads. What soft affinity is doing is essentially trading
> off cache coherence for more CPU. The optimum tradeoff point varies from
> workload to workload and with system characteristics such as the coherence
> overhead. If we just use the domain overloaded state, that becomes a static
> definition of the tipping point; we need something tunable that captures
> this tradeoff. The ratio of CPU utilization seemed to work well and capture
> that.
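For concreteness, one plausible shape for such a knob (a sketch with
made-up names, building on the includes above; the series' actual tunable
and ratio definition may differ):

/* Sketch of a ratio-based tipping point; sysctl_soft_affinity_ratio
 * is a made-up name, the real tunable in the series may differ. */
static unsigned int sysctl_soft_affinity_ratio = 75;	/* percent */

/* Spill out of cpus_preferred once its utilization exceeds the
 * configured fraction of its capacity. */
static bool should_spill(unsigned long util_preferred,
			 unsigned long cap_preferred)
{
	return util_preferred * 100 >
	       cap_preferred * sysctl_soft_affinity_ratio;
}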
And then you run two workloads with different characteristics on the
same box.
Global knobs are buggered.