Open Source and information security mailing list archives
 
Message-ID: <1398292317.2970.63.camel@schen9-DESK>
Date:	Wed, 23 Apr 2014 15:31:57 -0700
From:	Tim Chen <tim.c.chen@...ux.intel.com>
To:	Ingo Molnar <mingo@...e.hu>, Peter Zijlstra <peterz@...radead.org>
Cc:	linux-kernel@...r.kernel.org, Andi Kleen <ak@...ux.intel.com>,
	Len Brown <len.brown@...el.com>
Subject: [PATCH] sched: Skip double execution of pick_next_task_fair

The current code calls pick_next_task_fair a second time
in the slow path if we did not pull any task on our first try.
This is unnecessary, as we already know no task can be
pulled, and it doubles the delay before the cpu can enter idle.

We instrumented some network workloads and saw that
pick_next_task_fair is frequently called twice before a cpu enters idle.
The call to pick_next_task_fair can add non-trivial latency, as it
calls load_balance, which runs find_busiest_group on a hierarchy of
sched domains spanning the cpus of a large system.  On some 4-socket
systems, we saw almost 0.25 msec spent per call of pick_next_task_fair
before a cpu could be idled.

This patch skips pick_next_task_fair in the slow path if it
has already been invoked.

Tim

Signed-off-by: Tim Chen <tim.c.chen@...ux.intel.com>
---
 kernel/sched/core.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1d1b87b..4053437 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2583,6 +2583,7 @@ pick_next_task(struct rq *rq, struct task_struct *prev)
 {
 	const struct sched_class *class = &fair_sched_class;
 	struct task_struct *p;
+	int skip_fair = 0;
 
 	/*
 	 * Optimization: we know that if all tasks are in
@@ -2591,12 +2592,17 @@ pick_next_task(struct rq *rq, struct task_struct *prev)
 	if (likely(prev->sched_class == class &&
 		   rq->nr_running == rq->cfs.h_nr_running)) {
 		p = fair_sched_class.pick_next_task(rq, prev);
-		if (likely(p && p != RETRY_TASK))
+		if (!p)
+			skip_fair = 1;
+		else if (likely(p != RETRY_TASK))
 			return p;
 	}
 
 again:
 	for_each_class(class) {
+		/* if we have already failed to pull fair task, skip */
+		if (class == &fair_sched_class && skip_fair)
+			continue;
 		p = class->pick_next_task(rq, prev);
 		if (p) {
 			if (unlikely(p == RETRY_TASK))
-- 
1.7.11.7


