lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4F1DA9D0.6090208@fb.com>
Date:	Mon, 23 Jan 2012 10:41:20 -0800
From:	Arun Sharma <asharma@...com>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
CC:	<linux-kernel@...r.kernel.org>,
	Arnaldo Carvalho de Melo <acme@...radead.org>,
	Andrew Vagin <avagin@...nvz.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Ingo Molnar <mingo@...e.hu>,
	Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [PATCH] trace: reset sleep/block start time on task switch

On 1/23/12 3:34 AM, Peter Zijlstra wrote:

> Its not just your tracepoint data being wrong, it'll wreck all related
> stats :/

I see: these stats are also used in sched_debug.c.

>
> This'll fail to compile for !CONFIG_SCHEDSTAT I guess.. I should have
> paid more attention to the initial patch, that tracepoint having
> side-effects is a big no-no.
>
> Having unconditional writes there is somewhat sad, but I suspect putting
> a conditional around it isn't going to help much..

For performance reasons?


> bah can we
> restructure things so we don't need this?
>

We can go back to the old code, where these values were getting reset in 
{en,de}queue_sleeper(). But we'll have to do it conditionally, so the 
values are preserved till context switch time when we need it there.

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1003,6 +1003,8 @@ static void enqueue_sleeper(struct cfs_rq *cfs_rq, 
struct sched_entity *se)
                 if (unlikely(delta > se->statistics.sleep_max))
                         se->statistics.sleep_max = delta;

+               if (!trace_sched_stat_sleeptime_enabled())
+                       se->statistics.sleep_start = 0;
                 se->statistics.sum_sleep_runtime += delta;

                 if (tsk) {
@@ -1019,6 +1021,8 @@ static void enqueue_sleeper(struct cfs_rq *cfs_rq, 
struct sched_entity *se)
                 if (unlikely(delta > se->statistics.block_max))
                         se->statistics.block_max = delta;

+               if (!trace_sched_stat_sleeptime_enabled())
+                       se->statistics.block_start = 0;
                 se->statistics.sum_sleep_runtime += delta;

                 if (tsk) {

This looks pretty ugly too, I don't know how to check for a tracepoint 
being active (Steven?). The only advantage of this approach is that it's 
in the sleep/wakeup path, rather than the context switch path.

Conceptually, the following seems to be the simplest:

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1939,8 +1939,10 @@ static void finish_task_switch(struct rq *rq, 
struct task_struct *prev)
         finish_lock_switch(rq, prev);

         trace_sched_stat_sleeptime(current, rq->clock);
+#ifdef CONFIG_SCHEDSTAT
         current->se.statistics.block_start = 0;
         current->se.statistics.sleep_start = 0;
+#endif /* CONFIG_SCHEDSTAT */

Perhaps we can reorder fields in sched_statistics so we touch one 
cacheline here instead of two?

  -Arun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ