[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <tip-911b2898b3c9fe0048e9485ad1629ed4fce330fd@git.kernel.org>
Date: Wed, 13 Nov 2013 09:25:41 -0800
From: tip-bot for Peter Zijlstra <tipbot@...or.com>
To: linux-tip-commits@...r.kernel.org
Cc: linux-kernel@...r.kernel.org, hpa@...or.com, mingo@...nel.org,
torvalds@...ux-foundation.org, pjt@...gle.com,
peterz@...radead.org, akpm@...ux-foundation.org,
tglx@...utronix.de, kosaki.motohiro@...fujitsu.com,
lwoodman@...hat.com
Subject: [tip:sched/urgent] sched: Optimize task_sched_runtime()
Commit-ID: 911b2898b3c9fe0048e9485ad1629ed4fce330fd
Gitweb: http://git.kernel.org/tip/911b2898b3c9fe0048e9485ad1629ed4fce330fd
Author: Peter Zijlstra <peterz@...radead.org>
AuthorDate: Mon, 11 Nov 2013 18:21:56 +0100
Committer: Ingo Molnar <mingo@...nel.org>
CommitDate: Wed, 13 Nov 2013 13:33:54 +0100
sched: Optimize task_sched_runtime()
Large multi-threaded apps like to hit this using do_sys_times() and
then queue up on the rq->lock.
Avoid when possible.
Larry reported ~20% performance increase his test case.
Reported-by: Larry Woodman <lwoodman@...hat.com>
Suggested-by: Paul Turner <pjt@...gle.com>
Signed-off-by: Peter Zijlstra <peterz@...radead.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>
Link: http://lkml.kernel.org/r/20131111172925.GG26898@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@...nel.org>
---
kernel/sched/core.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1deccd7..c180860 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2253,6 +2253,20 @@ unsigned long long task_sched_runtime(struct task_struct *p)
struct rq *rq;
u64 ns = 0;
+#if defined(CONFIG_64BIT) && defined(CONFIG_SMP)
+ /*
+ * 64-bit doesn't need locks to atomically read a 64bit value.
+ * So we have a optimization chance when the task's delta_exec is 0.
+ * Reading ->on_cpu is racy, but this is ok.
+ *
+ * If we race with it leaving cpu, we'll take a lock. So we're correct.
+ * If we race with it entering cpu, unaccounted time is 0. This is
+ * indistinguishable from the read occurring a few cycles earlier.
+ */
+ if (!p->on_cpu)
+ return p->se.sum_exec_runtime;
+#endif
+
rq = task_rq_lock(p, &flags);
ns = p->se.sum_exec_runtime + do_task_delta_exec(p, rq);
task_rq_unlock(rq, p, &flags);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists