lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070619193903.GA15024@elte.hu>
Date:	Tue, 19 Jun 2007 21:39:03 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>
Cc:	linux-kernel@...r.kernel.org, Greg KH <gregkh@...e.de>,
	chrisw@...s-sol.org, stable@...nel.org,
	Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>,
	Christoph Lameter <clameter@....com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Subject: [patch] sched: fix next_interval determination in idle_balance()


2.6.22 must-have item - perhaps suitable for -stable too, because it was 
reproduced on 2.6.21.5 too.

---------------------->
From: Christoph Lameter <clameter@....com>
Subject: [patch] sched: fix next_interval determination in idle_balance()

Fix massive SMP imbalance on NUMA nodes observed on 2.6.21.5 with CFS. 
(and later on reproduced without CFS as well).

The intervals of domains that do not have SD_BALANCE_NEWIDLE must be 
considered for the calculation of the time of the next balance. 
Otherwise we may defer rebalancing forever and nodes might stay idle for 
very long times.

Siddha also spotted that the conversion of the balance interval to 
jiffies is missing. Fix that to.

From: Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>

also continue the loop if !(sd->flags & SD_LOAD_BALANCE).

Tested-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>

It did in fact trigger under all three of mainline, CFS, and -rt 
including CFS -- see below for a couple of emails from last Friday 
giving results for these three on the AMD box (where it happened) and on 
a single-quad NUMA-Q system (where it did not, at least not with such 
severity).

Signed-off-by: Christoph Lameter <clameter@....com>
Signed-off-by: Ingo Molnar <mingo@...e.hu>
---
 kernel/sched.c |   22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

Index: v/kernel/sched.c
===================================================================
--- v.orig/kernel/sched.c
+++ v/kernel/sched.c
@@ -2938,17 +2938,21 @@ static void idle_balance(int this_cpu, s
 	unsigned long next_balance = jiffies + 60 *  HZ;
 
 	for_each_domain(this_cpu, sd) {
-		if (sd->flags & SD_BALANCE_NEWIDLE) {
+		unsigned long interval;
+
+		if (!(sd->flags & SD_LOAD_BALANCE))
+			continue;
+
+		if (sd->flags & SD_BALANCE_NEWIDLE)
 			/* If we've pulled tasks over stop searching: */
 			pulled_task = load_balance_newidle(this_cpu,
-							this_rq, sd);
-			if (time_after(next_balance,
-				  sd->last_balance + sd->balance_interval))
-				next_balance = sd->last_balance
-						+ sd->balance_interval;
-			if (pulled_task)
-				break;
-		}
+								this_rq, sd);
+
+		interval = msecs_to_jiffies(sd->balance_interval);
+		if (time_after(next_balance, sd->last_balance + interval))
+			next_balance = sd->last_balance + interval;
+		if (pulled_task)
+			break;
 	}
 	if (!pulled_task)
 		/*
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ