linux-kernel - Re: [RFC patch 1/2] sched: dynamically adapt granularity with nr

Message-ID: <1284383758.2275.283.camel@laptop>
Date:	Mon, 13 Sep 2010 15:15:58 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Ingo Molnar <mingo@...e.hu>,
	Steven Rostedt <rostedt@...dmis.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Tony Lindgren <tony@...mide.com>,
	Mike Galbraith <efault@....de>
Subject: Re: [RFC patch 1/2] sched: dynamically adapt granularity with
 nr_running

On Mon, 2010-09-13 at 14:53 +0200, Peter Zijlstra wrote:
> On Sun, 2010-09-12 at 16:37 -0400, Mathieu Desnoyers wrote:
> > The whole point of my patch is not to have to do this latency vs performance
> > tradeoff for low number of running threads. With your approach, lowering the
> > granularity even when there are few threads running will very likely hurt
> > performance, no ? 
> 
> But you presented it as a latency patch, not a throughput patch. And I'm
> not sure it will matter enough to offset the computational cost it
> introduces.


One option is to simply get rid of that stuff in check_preempt_tick()
and do a wakeup-preempt check on the leftmost task instead.
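
To make the proposed flow concrete, here is a minimal user-space sketch of it. This is not the kernel code: `struct entity`, `model_wakeup_preempt()` and `model_tick_preempt()` are hypothetical names, the granularity is passed in as a plain number, and buddy handling is left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct sched_entity. */
struct entity {
	int64_t  vruntime;       /* virtual runtime, ns */
	uint64_t ran_this_slice; /* sum_exec_runtime - prev_sum_exec_runtime, ns */
};

/*
 * Simplified wakeup-preempt test (assumed semantics):
 *   1  if se is ahead of curr by more than gran,
 *   0  if se is ahead, but by no more than gran,
 *  -1  if se is not ahead of curr at all.
 */
static int model_wakeup_preempt(const struct entity *curr,
				const struct entity *se, int64_t gran)
{
	int64_t vdiff = curr->vruntime - se->vruntime;

	if (vdiff <= 0)
		return -1;
	return vdiff > gran ? 1 : 0;
}

/* Returns true when the tick should reschedule the current task. */
static bool model_tick_preempt(const struct entity *curr,
			       const struct entity *leftmost,
			       uint64_t slice, int64_t gran)
{
	if (curr->ran_this_slice < slice) {
		/* Slice not expired: preempt only if the leftmost task
		 * would have won a wakeup-preemption check. */
		return leftmost != NULL &&
		       model_wakeup_preempt(curr, leftmost, gran) == 1;
	}

	/* Slice expired: always reschedule. */
	return true;
}
```

The point of the shape is that the tick path and the wakeup path end up applying the same vruntime-based test, so no separate minimum-granularity guard is needed.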

The code as it stands today does that delta_exec < min_gran check to
ensure current gets some runtime before doing that second preemption
check, which compares vruntime with a wall-time measure.
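
For comparison, the existing tick logic can be modelled roughly as below. Again this is only a sketch under stated assumptions: `struct tick_entity` and `old_tick_preempt()` are made-up names, the WAKEUP_PREEMPT feature check is omitted, and a NULL `leftmost` stands in for the nr_running > 1 test.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct sched_entity. */
struct tick_entity {
	int64_t  vruntime;   /* virtual runtime, ns */
	uint64_t delta_exec; /* wall time run since last being picked, ns */
};

/* Returns true when the tick should reschedule the current task. */
static bool old_tick_preempt(const struct tick_entity *curr,
			     const struct tick_entity *leftmost,
			     uint64_t ideal_runtime, uint64_t min_gran)
{
	/* First check: current has used up its wall-time slice. */
	if (curr->delta_exec > ideal_runtime)
		return true;

	/* Guard: let current get some runtime before the second check. */
	if (curr->delta_exec < min_gran)
		return false;

	/* Second check: a vruntime difference is compared against
	 * ideal_runtime, which is a wall-time quantity. */
	if (leftmost != NULL) {
		int64_t delta = curr->vruntime - leftmost->vruntime;

		if (delta > (int64_t)ideal_runtime)
			return true;
	}

	return false;
}
```

Note how the last comparison mixes units: `delta` is in virtual time while `ideal_runtime` is in wall time.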

Making that gran more complex doesn't really buy us much, because on a
system with different weights the gran and slice lengths don't match
up anyway.
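
One way to read the mismatch, sketched with hypothetical helpers `model_slice()` and `model_vgran()`: assume the slice is a wall-time share of the period proportional to the task's weight, while the wakeup granularity is converted to virtual time by scaling inversely with weight. Both formulas are simplifications assumed for illustration, not copies of the kernel functions.

```c
#include <stdint.h>

#define NICE_0_WEIGHT 1024u /* load weight of a nice-0 task */

/* Wall-time slice: the task's share of the period, proportional
 * to its weight relative to the total weight on the runqueue. */
static uint64_t model_slice(uint64_t period_ns, uint32_t weight,
			    uint32_t total_weight)
{
	return period_ns * weight / total_weight;
}

/* Virtual-time granularity: a wall-time granularity scaled
 * inversely with the task's weight. */
static uint64_t model_vgran(uint64_t gran_ns, uint32_t weight)
{
	return gran_ns * NICE_0_WEIGHT / weight;
}
```

With a nice-0 task (weight 1024) and a nice -5 task (weight 3121) on the same runqueue, the heavier task gets the longer slice but the smaller virtual granularity, so the two quantities move in opposite directions as weights diverge.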

---
Subject: sched: Simplify tick preemption
From: Peter Zijlstra <a.p.zijlstra@...llo.nl>
Date: Mon Jul 05 13:56:30 CEST 2010

Check the current slice; if it has not expired, see if the leftmost
task would otherwise have preempted current.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@...llo.nl>
---
 kernel/sched_fair.c |   43 +++++++++++++++----------------------------
 1 file changed, 15 insertions(+), 28 deletions(-)

Index: linux-2.6/kernel/sched_fair.c
===================================================================
--- linux-2.6.orig/kernel/sched_fair.c
+++ linux-2.6/kernel/sched_fair.c
@@ -838,44 +838,34 @@ dequeue_entity(struct cfs_rq *cfs_rq, st
 		se->vruntime -= cfs_rq->min_vruntime;
 }
 
+static int
+wakeup_preempt_entity(struct sched_entity *curr, struct sched_entity *se);
+
 /*
  * Preempt the current task with a newly woken task if needed:
  */
 static void
 check_preempt_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr)
 {
-	unsigned long ideal_runtime, delta_exec;
+	unsigned long slice = sched_slice(cfs_rq, curr);
+
+	if (curr->sum_exec_runtime - curr->prev_sum_exec_runtime < slice) {
+		struct sched_entity *pse = __pick_next_entity(cfs_rq);
+
+		if (pse && wakeup_preempt_entity(curr, pse) == 1)
+			goto preempt;
 
-	ideal_runtime = sched_slice(cfs_rq, curr);
-	delta_exec = curr->sum_exec_runtime - curr->prev_sum_exec_runtime;
-	if (delta_exec > ideal_runtime) {
-		resched_task(rq_of(cfs_rq)->curr);
-		/*
-		 * The current task ran long enough, ensure it doesn't get
-		 * re-elected due to buddy favours.
-		 */
-		clear_buddies(cfs_rq, curr);
 		return;
 	}
 
 	/*
-	 * Ensure that a task that missed wakeup preemption by a
-	 * narrow margin doesn't have to wait for a full slice.
-	 * This also mitigates buddy induced latencies under load.
+	 * The current task ran long enough, ensure it doesn't get
+	 * re-elected due to buddy favours.
 	 */
-	if (!sched_feat(WAKEUP_PREEMPT))
-		return;
-
-	if (delta_exec < sysctl_sched_min_granularity)
-		return;
+	clear_buddies(cfs_rq, curr);
 
-	if (cfs_rq->nr_running > 1) {
-		struct sched_entity *se = __pick_next_entity(cfs_rq);
-		s64 delta = curr->vruntime - se->vruntime;
-
-		if (delta > ideal_runtime)
-			resched_task(rq_of(cfs_rq)->curr);
-	}
+preempt:
+	resched_task(rq_of(cfs_rq)->curr);
 }
 
 static void
@@ -908,9 +898,6 @@ set_next_entity(struct cfs_rq *cfs_rq, s
 	se->prev_sum_exec_runtime = se->sum_exec_runtime;
 }
 
-static int
-wakeup_preempt_entity(struct sched_entity *curr, struct sched_entity *se);
-
 static struct sched_entity *pick_next_entity(struct cfs_rq *cfs_rq)
 {
 	struct sched_entity *se = __pick_next_entity(cfs_rq);
