Message-ID: <20160125133944.GE3162@techsingularity.net>
Date: Mon, 25 Jan 2016 13:39:44 +0000
From: Mel Gorman <mgorman@...hsingularity.net>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...nel.org>,
Matt Fleming <matt@...eblueprint.co.uk>,
Mike Galbraith <mgalbraith@...e.de>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] sched: Make schedstats a runtime tunable that is
disabled by default
On Mon, Jan 25, 2016 at 12:26:06PM +0100, Peter Zijlstra wrote:
> On Mon, Jan 25, 2016 at 10:05:31AM +0000, Mel Gorman wrote:
> > schedstats is very useful during debugging and performance tuning but it
> > incurs overhead. As such, even though it can be disabled at build time,
> > it is often enabled as the information is useful. This patch adds a
> > kernel command-line and sysctl tunable to enable or disable schedstats on
> > demand. It is disabled by default as someone who knows they need it can
> > also learn to enable it when necessary.
>
> So the reason its often enabled in distro configs is (IIRC) that it
> enables trace_sched_stat_{wait,sleep,iowait,blocked}().
>
> I've not looked at the details of this patch, but I suspect this patch
> would make these tracepoints available but non-functional unless you
> poke the magic button.
>
It's potentially slightly worse than that. The tracepoints are available
and functional, but they produce garbage unless the magic button is poked,
and they do a fair amount of work producing that garbage. I missed a few
hunks, which are included below. With these, the tracepoints will still
exist, but unless the magic button is poked they'll never fire. Considering
the paths affected, this will require retesting, but if it passes, would
you be ok in general with a patch like this that forces a button to be
pushed if the user is doing performance analysis?
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0fce4e353f3c..b39f2ff13345 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -755,7 +755,12 @@ static void
 update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
 	struct task_struct *p;
-	u64 delta = rq_clock(rq_of(cfs_rq)) - se->statistics.wait_start;
+	u64 delta;
+
+	if (!static_branch_unlikely(&sched_schedstats))
+		return;
+
+	delta = rq_clock(rq_of(cfs_rq)) - se->statistics.wait_start;
 
 	if (entity_is_task(se)) {
 		p = task_of(se);
@@ -2982,6 +2987,9 @@ static void enqueue_sleeper(struct cfs_rq *cfs_rq, struct sched_entity *se)
 #ifdef CONFIG_SCHEDSTATS
 	struct task_struct *tsk = NULL;
 
+	if (!static_branch_unlikely(&sched_schedstats))
+		return;
+
 	if (entity_is_task(se))
 		tsk = task_of(se);
--
Mel Gorman
SUSE Labs