lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 2 Apr 2015 10:17:04 +0100
From:	Morten Rasmussen <morten.rasmussen@....com>
To:	Jason Low <jason.low2@...com>
Cc:	Preeti U Murthy <preeti@...ux.vnet.ibm.com>,
	"peterz@...radead.org" <peterz@...radead.org>,
	"mingo@...nel.org" <mingo@...nel.org>,
	Daniel Lezcano <daniel.lezcano@...aro.org>,
	"riel@...hat.com" <riel@...hat.com>,
	"vincent.guittot@...aro.org" <vincent.guittot@...aro.org>,
	"srikar@...ux.vnet.ibm.com" <srikar@...ux.vnet.ibm.com>,
	"pjt@...gle.com" <pjt@...gle.com>,
	"benh@...nel.crashing.org" <benh@...nel.crashing.org>,
	"efault@....de" <efault@....de>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"iamjoonsoo.kim@....com" <iamjoonsoo.kim@....com>,
	"svaidy@...ux.vnet.ibm.com" <svaidy@...ux.vnet.ibm.com>,
	"tim.c.chen@...ux.intel.com" <tim.c.chen@...ux.intel.com>
Subject: Re: sched: Improve load balancing in the presence of idle CPUs

On Thu, Apr 02, 2015 at 06:59:07AM +0100, Jason Low wrote:
> On Wed, 2015-04-01 at 18:04 +0100, Morten Rasmussen wrote:
> > On Wed, Apr 01, 2015 at 07:49:56AM +0100, Preeti U Murthy wrote:
> 
> > > I am sorry I don't quite get this. Can you please elaborate?
> > 
> > I think the scenario is that we are in nohz_idle_balance() and decide to
> > bail out because we have pulled some tasks, but before leaving
> > nohz_idle_balance() we want to check if more balancing is necessary
> > using nohz_kick_needed() and potentially kick somebody to continue.
> 
> Also, below is an example patch.
> 
> (Without the conversion to idle_cpu(), the check for rq->idle_balance
> would not be accurate anymore)
> 
> ---
>  kernel/sched/fair.c |   17 ++++++++++-------
>  1 files changed, 10 insertions(+), 7 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index fdae26e..7749a14 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7620,6 +7620,8 @@ out:
>  }
>  
>  #ifdef CONFIG_NO_HZ_COMMON
> +static inline bool nohz_kick_needed(struct rq *rq);
> +
>  /*
>   * In CONFIG_NO_HZ_COMMON case, the idle balance kickee will do the
>   * rebalancing for all the cpus for whom scheduler ticks are stopped.
> @@ -7629,6 +7631,7 @@ static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
>  	int this_cpu = this_rq->cpu;
>  	struct rq *rq;
>  	int balance_cpu;
> +	bool done_balancing = false;
>  
>  	if (idle != CPU_IDLE ||
>  	    !test_bit(NOHZ_BALANCE_KICK, nohz_flags(this_cpu)))
> @@ -7644,7 +7647,7 @@ static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
>  		 * balancing owner will pick it up.
>  		 */
>  		if (need_resched())
> -			break;
> +			goto end;
>  
>  		rq = cpu_rq(balance_cpu);
>  
> @@ -7663,9 +7666,12 @@ static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
>  		if (time_after(this_rq->next_balance, rq->next_balance))
>  			this_rq->next_balance = rq->next_balance;
>  	}
> +	done_balancing = true;
>  	nohz.next_balance = this_rq->next_balance;
>  end:
>  	clear_bit(NOHZ_BALANCE_KICK, nohz_flags(this_cpu));
> +	if (!done_balancing && nohz_kick_needed(this_rq))
> +		nohz_balancer_kick();
>  }
>  
>  /*
> @@ -7687,7 +7693,7 @@ static inline bool nohz_kick_needed(struct rq *rq)
>  	int nr_busy, cpu = rq->cpu;
>  	bool kick = false;
>  
> -	if (unlikely(rq->idle_balance))
> +	if (unlikely(idle_cpu(cpu)))
>  		return false;
>  
>         /*
> @@ -7757,16 +7763,13 @@ static void run_rebalance_domains(struct softirq_action *h)
>  	enum cpu_idle_type idle = this_rq->idle_balance ?
>  						CPU_IDLE : CPU_NOT_IDLE;
>  
> +	rebalance_domains(this_rq, idle);
>  	/*
>  	 * If this cpu has a pending nohz_balance_kick, then do the
>  	 * balancing on behalf of the other idle cpus whose ticks are
> -	 * stopped. Do nohz_idle_balance *before* rebalance_domains to
> -	 * give the idle cpus a chance to load balance. Else we may
> -	 * load balance only within the local sched_domain hierarchy
> -	 * and abort nohz_idle_balance altogether if we pull some load.
> +	 * stopped.
>  	 */
>  	nohz_idle_balance(this_rq, idle);
> -	rebalance_domains(this_rq, idle);
>  }

I think this should reduce the latency Preeti is seeing and avoid
unnecessary wake-ups, however, it may not be quite as aggressive in
spreading tasks quickly. It will stop the chain-of-kicks as soon as the
balancer cpu has pulled only one task. The source cpu may still be
having two tasks and other cpus may still have more than two tasks
running.

Depending on how bad it is, we could consider kicking another cpu if the
imbalance is still significant after the balancer cpu has pulled a task.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ