[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4F1DA9D0.6090208@fb.com>
Date: Mon, 23 Jan 2012 10:41:20 -0800
From: Arun Sharma <asharma@...com>
To: Peter Zijlstra <a.p.zijlstra@...llo.nl>
CC: <linux-kernel@...r.kernel.org>,
Arnaldo Carvalho de Melo <acme@...radead.org>,
Andrew Vagin <avagin@...nvz.org>,
Frederic Weisbecker <fweisbec@...il.com>,
Ingo Molnar <mingo@...e.hu>,
Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [PATCH] trace: reset sleep/block start time on task switch
On 1/23/12 3:34 AM, Peter Zijlstra wrote:
> Its not just your tracepoint data being wrong, it'll wreck all related
> stats :/
I see: these stats are also used in sched_debug.c.
>
> This'll fail to compile for !CONFIG_SCHEDSTAT I guess.. I should have
> paid more attention to the initial patch, that tracepoint having
> side-effects is a big no-no.
>
> Having unconditional writes there is somewhat sad, but I suspect putting
> a conditional around it isn't going to help much..
For performance reasons?
> bah can we
> restructure things so we don't need this?
>
We can go back to the old code, where these values were getting reset in
{en,de}queue_sleeper(). But we'll have to do it conditionally, so the
values are preserved till context switch time when we need it there.
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1003,6 +1003,8 @@ static void enqueue_sleeper(struct cfs_rq *cfs_rq,
struct sched_entity *se)
if (unlikely(delta > se->statistics.sleep_max))
se->statistics.sleep_max = delta;
+ if (!trace_sched_stat_sleeptime_enabled())
+ se->statistics.sleep_start = 0;
se->statistics.sum_sleep_runtime += delta;
if (tsk) {
@@ -1019,6 +1021,8 @@ static void enqueue_sleeper(struct cfs_rq *cfs_rq,
struct sched_entity *se)
if (unlikely(delta > se->statistics.block_max))
se->statistics.block_max = delta;
+ if (!trace_sched_stat_sleeptime_enabled())
+ se->statistics.block_start = 0;
se->statistics.sum_sleep_runtime += delta;
if (tsk) {
This looks pretty ugly too, I don't know how to check for a tracepoint
being active (Steven?). The only advantage of this approach is that it's
in the sleep/wakeup path, rather than the context switch path.
Conceptually, the following seems to be the simplest:
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1939,8 +1939,10 @@ static void finish_task_switch(struct rq *rq,
struct task_struct *prev)
finish_lock_switch(rq, prev);
trace_sched_stat_sleeptime(current, rq->clock);
+#ifdef CONFIG_SCHEDSTAT
current->se.statistics.block_start = 0;
current->se.statistics.sleep_start = 0;
+#endif /* CONFIG_SCHEDSTAT */
Perhaps we can reorder fields in sched_statistics so we touch one
cacheline here instead of two?
-Arun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists