lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20191030145133.GH3016@techsingularity.net>
Date:   Wed, 30 Oct 2019 14:51:33 +0000
From:   Mel Gorman <mgorman@...hsingularity.net>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     linux-kernel@...r.kernel.org, mingo@...hat.com,
        peterz@...radead.org, pauld@...hat.com, valentin.schneider@....com,
        srikar@...ux.vnet.ibm.com, quentin.perret@....com,
        dietmar.eggemann@....com, Morten.Rasmussen@....com,
        hdanton@...a.com, parth@...ux.ibm.com, riel@...riel.com
Subject: Re: [PATCH v4 01/11] sched/fair: clean up asym packing

On Fri, Oct 18, 2019 at 03:26:28PM +0200, Vincent Guittot wrote:
> Clean up asym packing to follow the default load balance behavior:
> - classify the group by creating a group_asym_packing field.
> - calculate the imbalance in calculate_imbalance() instead of bypassing it.
> 
> We don't need to test twice same conditions anymore to detect asym packing
> and we consolidate the calculation of imbalance in calculate_imbalance().
> 
> There is no functional changes.
> 
> Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
> Acked-by: Rik van Riel <riel@...riel.com>
> ---
>  kernel/sched/fair.c | 63 ++++++++++++++---------------------------------------
>  1 file changed, 16 insertions(+), 47 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 1f0a5e1..617145c 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7675,6 +7675,7 @@ struct sg_lb_stats {
>  	unsigned int group_weight;
>  	enum group_type group_type;
>  	int group_no_capacity;
> +	unsigned int group_asym_packing; /* Tasks should be moved to preferred CPU */
>  	unsigned long group_misfit_task_load; /* A CPU has a task too big for its capacity */
>  #ifdef CONFIG_NUMA_BALANCING
>  	unsigned int nr_numa_running;
> @@ -8129,9 +8130,17 @@ static bool update_sd_pick_busiest(struct lb_env *env,
>  	 * ASYM_PACKING needs to move all the work to the highest
>  	 * prority CPUs in the group, therefore mark all groups
>  	 * of lower priority than ourself as busy.
> +	 *
> +	 * This is primarily intended to used at the sibling level.  Some
> +	 * cores like POWER7 prefer to use lower numbered SMT threads.  In the
> +	 * case of POWER7, it can move to lower SMT modes only when higher
> +	 * threads are idle.  When in lower SMT modes, the threads will
> +	 * perform better since they share less core resources.  Hence when we
> +	 * have idle threads, we want them to be the higher ones.
>  	 */
>  	if (sgs->sum_nr_running &&
>  	    sched_asym_prefer(env->dst_cpu, sg->asym_prefer_cpu)) {
> +		sgs->group_asym_packing = 1;
>  		if (!sds->busiest)
>  			return true;
>  

(I did not read any of the earlier implementations of this series, maybe
this was discussed already in which case, sorry for the noise)

Are you *sure* this is not a functional change?

Asym packing is a twisty maze of headaches and I'm not familiar enough
with it to be 100% certain without spending a lot of time on this. Even
spotting how Power7 ends up using asym packing with lower-numbered SMT
threads is a bit of a challenge.  Specifically, it relies on the scheduler
domain SD_ASYM_PACKING flag for SMT domains to use the weak implementation
of arch_asym_cpu_priority which by defaults favours the lower-numbered CPU.

The check_asym_packing implementation checks that flag but I can't see
the equiavlent type of check here. update_sd_pick_busiest  could be called
for domains that span NUMA or basically any domain that does not specify
SD_ASYM_PACKING and end up favouring a lower-numbered CPU (or whatever
arch_asym_cpu_priority returns in the case of x86 which has a different
idea for favoured CPUs).

sched_asym_prefer appears to be a function that is very easy to use
incorrectly. Should it take env and check the SD flags first?

-- 
Mel Gorman
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ