Message-ID: <20191219084134.GH3178@techsingularity.net>
Date:   Thu, 19 Dec 2019 08:41:34 +0000
From:   Mel Gorman <mgorman@...hsingularity.net>
To:     Rik van Riel <riel@...riel.com>
Cc:     Vincent Guittot <vincent.guittot@...aro.org>,
        Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>, pauld@...hat.com,
        valentin.schneider@....com, srikar@...ux.vnet.ibm.com,
        quentin.perret@....com, dietmar.eggemann@....com,
        Morten.Rasmussen@....com, hdanton@...a.com, parth@...ux.ibm.com,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] sched, fair: Allow a small degree of load imbalance
 between SD_NUMA domains

On Wed, Dec 18, 2019 at 09:58:01PM -0500, Rik van Riel wrote:
> On Wed, 2019-12-18 at 15:44 +0000, Mel Gorman wrote:
> 
> > +			/*
> > +			 * Ignore imbalance unless busiest sd is close to 50%
> > +			 * utilisation. At that point balancing for memory
> > +			 * bandwidth and potentially avoiding unnecessary use
> > +			 * of HT siblings is as relevant as memory locality.
> > +			 */
> > +			imbalance_max = (busiest->group_weight >> 1) - imbalance_adj;
> > +			if (env->imbalance <= imbalance_adj &&
> > +			    busiest->sum_nr_running < imbalance_max) {
> > +				env->imbalance = 0;
> > +			}
> > +		}
> >  		return;
> >  	}
> 
> I can see how the 50% point is often great for HT,
> but I wonder if that is also the case for SMT4 and
> SMT8 systems...
> 

Maybe, maybe not, but it's not the most important concern. The highlight
in the comment was memory bandwidth; HT was simply an additional
concern. Ideally memory bandwidth and consumption would be taken into
account, but we know nothing about either. Even if peak memory bandwidth
were known, the reference pattern matters a *lot*, which can be readily
illustrated by running STREAM and observing the different bandwidths for
different reference patterns. Similarly, while we might know which pages
were referenced, we do not know the bandwidth consumption without incurring
additional overhead with a PMU. Hence, it makes sense to at least hope
that the active tasks have similar memory bandwidth requirements and to load
balance as normal when we are near 50% active tasks/busy CPUs. If
SMT4 or SMT8 have different requirements or it matters for memory
bandwidth, then it would need to be carefully examined by someone with
access to such hardware to determine an arch-specific and maybe even a
per-CPU-family cutoff.

In the context of this patch, it unconditionally makes sense that in the
basic case two communicating tasks are not migrated cross-node on
wakeup and then again on load balance.

-- 
Mel Gorman
SUSE Labs
