Date:	Fri, 03 Jul 2015 11:29:07 +0200
From:	Mike Galbraith <umgwanakikbuti@...il.com>
To:	Josef Bacik <jbacik@...com>
Cc:	Peter Zijlstra <peterz@...radead.org>, riel@...hat.com,
	mingo@...hat.com, linux-kernel@...r.kernel.org,
	morten.rasmussen@....com, kernel-team <Kernel-team@...com>
Subject: Re: [PATCH RESEND] sched: prefer an idle cpu vs an idle sibling for
 BALANCE_WAKE

On Fri, 2015-07-03 at 08:40 +0200, Mike Galbraith wrote:

> Hm.  Seems what this load should like best is if we detect 1:N, skip all
> of the routine gyrations, ie move the N (workers) infrequently, expend
> search cycles frequently only on the 1 (dispatch).
> 
> Ponder..

While taking a refresher peek at the wake_wide() thing, seems it's not
really paying attention when the waker of many is awakened.  I wonder if
your load would see more benefit if it watched like so.. rashly assuming
I didn't wreck it completely (iow, completely untested).

---
 kernel/sched/fair.c |   36 ++++++++++++++++++++++--------------
 1 file changed, 22 insertions(+), 14 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4586,10 +4586,23 @@ static void record_wakee(struct task_str
 		current->wakee_flips >>= 1;
 		current->wakee_flip_decay_ts = jiffies;
 	}
+	if (time_after(jiffies, p->wakee_flip_decay_ts + HZ)) {
+		p->wakee_flips >>= 1;
+		p->wakee_flip_decay_ts = jiffies;
+	}
 
 	if (current->last_wakee != p) {
 		current->last_wakee = p;
 		current->wakee_flips++;
+		/*
+		 * Flip the buddy as well.  It's the ratio of flips
+		 * with a socket size decayed cutoff that determines
+		 * whether the pair are considered to be part of 1:N
+		 * or M*N loads of a size that we need to spread, so
+		 * ensure flips of both load components.  The waker
+		 * of many will have many more flips than its wakees.
+		 */
+		p->wakee_flips++;
 	}
 }
 
@@ -4732,24 +4745,19 @@ static long effective_load(struct task_g
 
 static int wake_wide(struct task_struct *p)
 {
+	unsigned long max = max(current->wakee_flips, p->wakee_flips);
+	unsigned long min = min(current->wakee_flips, p->wakee_flips);
 	int factor = this_cpu_read(sd_llc_size);
 
 	/*
-	 * Yeah, it's the switching-frequency, could means many wakee or
-	 * rapidly switch, use factor here will just help to automatically
-	 * adjust the loose-degree, so bigger node will lead to more pull.
+	 * Yeah, it's a switching-frequency heuristic, and could mean the
+	 * intended many wakees/waker relationship, or rapidly switching
+	 * between a few.  Use factor to try to automatically adjust such
+	 * that the load spreads when it grows beyond what will fit in llc.
 	 */
-	if (p->wakee_flips > factor) {
-		/*
-		 * wakee is somewhat hot, it needs certain amount of cpu
-		 * resource, so if waker is far more hot, prefer to leave
-		 * it alone.
-		 */
-		if (current->wakee_flips > (factor * p->wakee_flips))
-			return 1;
-	}
-
-	return 0;
+	if (min < factor)
+		return 0;
+	return max > min * factor;
 }
 
 static int wake_affine(struct sched_domain *sd, struct task_struct *p, int sync)
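
For anyone wanting to poke at the heuristic outside the kernel, here is a minimal userspace sketch of the patched record_wakee() and wake_wide() logic. It is a model under stated assumptions, not the kernel code: struct task, simulate() and the llc_size parameter are hypothetical stand-ins (llc_size plays the role of sd_llc_size), the once-per-HZ decay of wakee_flips is left out, and the workers never wake anyone themselves.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WORKERS 64

/* Hypothetical stand-in for the task_struct fields the patch touches. */
struct task {
	unsigned long wakee_flips;
	struct task *last_wakee;
};

/* Model of the patched record_wakee(); the jiffies-based decay is omitted. */
static void record_wakee(struct task *waker, struct task *wakee)
{
	if (waker->last_wakee != wakee) {
		waker->last_wakee = wakee;
		waker->wakee_flips++;
		wakee->wakee_flips++;	/* flip the buddy as well */
	}
}

/* Model of the patched wake_wide(); llc_size stands in for sd_llc_size. */
static int wake_wide(const struct task *waker, const struct task *wakee,
		     unsigned long llc_size)
{
	unsigned long a = waker->wakee_flips;
	unsigned long b = wakee->wakee_flips;
	unsigned long hi = a > b ? a : b;
	unsigned long lo = a > b ? b : a;

	/* The less active partner must itself flip often enough. */
	if (lo < llc_size)
		return 0;
	/* Spread only when the flip ratio exceeds the llc size. */
	return hi > lo * llc_size;
}

/*
 * One dispatcher wakes n_workers workers round-robin for the given
 * number of rounds; returns the wake_wide() verdict for worker 0.
 */
int simulate(int n_workers, int rounds, unsigned long llc_size)
{
	struct task dispatcher = { 0, NULL };
	struct task workers[MAX_WORKERS] = { { 0, NULL } };
	int r, i;

	assert(n_workers > 0 && n_workers <= MAX_WORKERS);
	for (r = 0; r < rounds; r++)
		for (i = 0; i < n_workers; i++)
			record_wakee(&dispatcher, &workers[i]);

	return wake_wide(&dispatcher, &workers[0], llc_size);
}
```

In this model a dispatcher cycling through 8 workers for 4 rounds with llc_size 4 ends up with 32 flips against 4 for each worker, so simulate(8, 4, 4) says spread. With only 2 workers the ratio stays at 2 no matter how long it runs, so simulate(2, 100, 4) stays affine, and simulate(8, 2, 4) stays affine because the workers have not yet reached llc_size flips.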
