Date:   Mon, 19 Nov 2018 12:33:45 -0500
From:   Steven Sistare <steven.sistare@...cle.com>
To:     Valentin Schneider <valentin.schneider@....com>, mingo@...hat.com,
        peterz@...radead.org
Cc:     subhra.mazumdar@...cle.com, dhaval.giani@...cle.com,
        daniel.m.jordan@...cle.com, pavel.tatashin@...rosoft.com,
        matt@...eblueprint.co.uk, umgwanakikbuti@...il.com,
        riel@...hat.com, jbacik@...com, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, quentin.perret@....com,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 03/10] sched/topology: Provide cfs_overload_cpus bitmap

On 11/12/2018 11:42 AM, Valentin Schneider wrote:
> Hi Steve,
> 
> On 09/11/2018 12:50, Steve Sistare wrote:
>> From: Steve Sistare <steve.sistare@...cle.com>
>>
>> Define and initialize a sparse bitmap of overloaded CPUs, per
>> last-level-cache scheduling domain, for use by the CFS scheduling class.
>> Save a pointer to cfs_overload_cpus in the rq for efficient access.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@...cle.com>
>> ---
>>  include/linux/sched/topology.h |  1 +
>>  kernel/sched/sched.h           |  2 ++
>>  kernel/sched/topology.c        | 21 +++++++++++++++++++--
>>  3 files changed, 22 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
>> index 6b99761..b173a77 100644
>> --- a/include/linux/sched/topology.h
>> +++ b/include/linux/sched/topology.h
>> @@ -72,6 +72,7 @@ struct sched_domain_shared {
>>  	atomic_t	ref;
>>  	atomic_t	nr_busy_cpus;
>>  	int		has_idle_cores;
>> +	struct sparsemask *cfs_overload_cpus;
> 
> Thinking about misfit stealing, we can't use the sd_llc_shared's because
> on big.LITTLE misfit migrations happen across LLC domains.
> 
> I was thinking of adding a misfit sparsemask to the root_domain, but
> then I thought we could do the same thing for cfs_overload_cpus.
> 
> By doing so we'd have a single source of information for overloaded CPUs,
> and we could filter that down during idle balance - you mentioned earlier
> wanting to try stealing at each SD level. This would also let you get
> rid of [PATCH 02].
> 
> The main part of try_steal() could then be written down as something like
> this:
> 
> ----->8-----
> 
> for_each_domain(this_cpu, sd) {
> 	span = sched_domain_span(sd);
> 
> 	for_each_sparse_wrap(src_cpu, overload_cpus) {
> 		if (cpumask_test_cpu(src_cpu, span) &&
> 		    steal_from(dst_rq, dst_rf, &locked, src_cpu)) {
> 			stolen = 1;
> 			goto out;
> 		}
> 	}
> }
> 
> ------8<-----
> 
> We could limit the stealing to stop at the highest SD_SHARE_PKG_RESOURCES
> domain for now so there would be no behavioural change - but we'd
> factorize the #ifdef SCHED_SMT bit. Furthermore, the door would be open
> to further stealing.
> 
> What do you think?

That is not efficient for a multi-level search, because at each domain level
we would re-iterate over overloaded candidates that do not belong to that
level.  To extend stealing across LLC, I would like to keep the per-LLC
sparsemask, but add to each SD a list of sparsemask pointers.  The list nodes
would be private to each SD, but the sparsemask structs would be shared.  Each
list would include the masks that overlap the SD's members.  The list would be
a singleton at the core and LLC levels (LLC being the same as the socket level
for most processors), and would have multiple elements at the NUMA level.
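
A rough user-space sketch of that layout, for illustration only.  It is not
code from the patch series: struct mask_node, struct domain and both helper
functions are hypothetical names, and a single 64-bit word stands in for the
real struct sparsemask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct sparsemask: one bit per CPU. */
struct sparsemask {
	uint64_t bits;
};

/* List node private to one domain; the mask it points to is shared. */
struct mask_node {
	struct sparsemask *mask;
	struct mask_node *next;
};

/* Hypothetical, reduced sched domain: only the fields the sketch needs. */
struct domain {
	struct mask_node *overload_masks;	/* masks overlapping this SD */
	struct domain *parent;			/* next level up, or NULL */
};

/* Return the lowest overloaded CPU in the first non-empty mask, or -1. */
static int find_overloaded(const struct domain *sd)
{
	const struct mask_node *n;

	for (n = sd->overload_masks; n; n = n->next) {
		uint64_t bits;
		int cpu;

		assert(n->mask != NULL);
		bits = n->mask->bits;
		for (cpu = 0; bits; cpu++, bits >>= 1)
			if (bits & 1)
				return cpu;
	}
	return -1;
}

/* Walk up from the lowest domain, scanning only that level's masks. */
static int pick_steal_source(const struct domain *sd)
{
	for (; sd; sd = sd->parent) {
		int cpu = find_overloaded(sd);

		if (cpu >= 0)
			return cpu;
	}
	return -1;
}
```

With one mask per LLC, the LLC-level list has a single node and a NUMA-level
list has one node per LLC it spans, so each level only visits masks whose CPUs
can belong to it.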

- Steve
