lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 28 Jul 2014 16:30:05 +0200
From:	Peter Zijlstra <>
To:	Rik van Riel <>
Cc:	Vincent Guittot <>,
	linux-kernel <>,
	Michael Neuling <>,
	Ingo Molnar <>, Paul Turner <>,,,,
	Nicolas Pitre <>
Subject: Re: [PATCH v2] sched: make update_sd_pick_busiest return true on a
 busier sd

On Fri, Jul 25, 2014 at 03:32:10PM -0400, Rik van Riel wrote:
> Subject: sched: make update_sd_pick_busiest return true on a busier sd
> Currently update_sd_pick_busiest only identifies the busiest sd
> that is either overloaded, or has a group imbalance. When no
> sd is imbalanced or overloaded, the load balancer fails to find
> the busiest domain.
> This breaks load balancing between domains that are not overloaded,
> in the !SD_ASYM_PACKING case. This patch makes update_sd_pick_busiest
> return true when the busiest sd yet is encountered.
> Behaviour for SD_ASYM_PACKING does not seem to match the comment,
> but I have no hardware to test that so I have left the behaviour
> of that code unchanged.
> It is unclear what to do with the group_imb condition.
> Should group_imb override a busier load? If so, should we fix
> calculate_imbalance to return a sensible number when the "busiest"
> node found has a below average load? We probably need to fix
> calculate_imbalance anyway, to deal with an overloaded group that
> happens to have a below average load...

I think we want overloaded > group_imb > other. So prefer overloaded
groups, imbalanced groups if no overloaded and anything else if no
overloaded and imbalanced thingies.

If you look at the comment near sg_imbalanced(), in that case we want to
move tasks from the first group to the second, even though the second
group would be the heaviest.

enum group_type {
	group_other = 0,

static enum group_type group_classify(struct sg_lb_stats *sgs)
	if (sgs->sum_nr_running > sgs->group_capacity_factor)
		return group_overloaded;

	if (sgs->group_imb)
		return group_imbalanced;

	return group_other;

>  /**
>   * update_sd_pick_busiest - return 1 on busiest group
>   * @env: The load balancing environment.
> @@ -5957,7 +5962,7 @@ static inline void update_sg_lb_stats(struct lb_env *env,
>   * @sgs: sched_group statistics
>   *
>   * Determine if @sg is a busier group than the previously selected
> - * busiest group.
> + * busiest group. 

We really need that extra trailing whitespace, yes? ;-)

>   * Return: %true if @sg is a busier group than the previously selected
>   * busiest group. %false otherwise.
> @@ -5967,13 +5972,17 @@ static bool update_sd_pick_busiest(struct lb_env *env,
>  				   struct sched_group *sg,
>  				   struct sg_lb_stats *sgs)
>  {

	if (group_classify(sgs) < group_classify(&sds->busiest_stats))
		return false;

>  	if (sgs->avg_load <= sds->busiest_stat.avg_load)
>  		return false;

> +	/* This is the busiest node. */
> +	if (!(env->sd->flags & SD_ASYM_PACKING))
>  		return true;
>  	/*

We could replace sg_lb_stats::group_imb with the above and avoid the
endless recomputation I suppose.

Also, we still need a little change to calculate_imbalance() where we
assume sum_nr_running > group_capacity_factor.

Content of type "application/pgp-signature" skipped

Powered by blists - more mailing lists