Message-ID: <20141215172129.GS3337@twins.programming.kicks-ass.net>
Date: Mon, 15 Dec 2014 18:21:29 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Josef Bacik <jbacik@...com>
Cc: bmaurer@...com, rkroll@...com, kernel-team@...com,
mingo@...hat.com, linux-kernel@...r.kernel.org,
umgwanakikbuti@...il.com
Subject: Re: [PATCH] sched/fair: change where we report sched stats V2

On Mon, Dec 15, 2014 at 10:37:09AM -0500, Josef Bacik wrote:
> >Yeah, so I don't like this, it adds overhead for everyone.
> >
>
> Only if SCHEDSTATS is enabled tho, and it's no more overhead in the
> SCHEDSTATS case than before. Would it be more acceptable to move the entire
> callback under SCHEDSTATS?

Nah, doesn't work. Distros need to enable the world and then some so
.config is a false choice.

> This is fine for discrete problems, but when trying to find a random latency
> spike in a production workload it's impossible. If I do
>
> trace-cmd record -e sched:sched_switch -T sleep 5
>
> on just one of our random web servers I end up with this
>
> du -h trace.dat
> 62M trace.dat
>
> that's 62 megs in 5 seconds. I ran the following command for almost 2 hours
> when searching for a latency spike
>
> trace-cmd record -B latency -e sched:sched_stat_blocked -f "delay >= 100000" -T -o /root/latency.dat
>
> and got the following .dat file
>
> du -h latency.dat
> 48M latency.dat

Ah, regardless of what I think of our filter implementation, that actually
makes sense. Let me ponder this a bit.
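
For scale, the two capture sizes quoted above can be compared with a quick
calculation. This is only a rough sketch: it treats the "M" printed by
du -h as MiB and rounds "almost 2 hours" to exactly 2 hours, both of which
are assumptions, and the variable names are purely illustrative.

```python
# Back-of-the-envelope comparison of the two trace captures quoted above.
# Assumptions: "M" from `du -h` means MiB; "almost 2 hours" is taken as 2 h.
MIB = 1024 * 1024
GIB = 1024 * MIB

unfiltered_bytes = 62 * MIB        # sched:sched_switch, no filter
unfiltered_secs = 5

filtered_bytes = 48 * MIB          # sched:sched_stat_blocked, delay >= 100000
filtered_secs = 2 * 60 * 60

unfiltered_rate = unfiltered_bytes / unfiltered_secs   # bytes per second
filtered_rate = filtered_bytes / filtered_secs         # bytes per second

# How much faster the unfiltered capture grows than the filtered one.
ratio = unfiltered_rate / filtered_rate

# Size the unfiltered capture would reach if left running for the same 2 hours.
projected_gib = unfiltered_rate * filtered_secs / GIB

print(f"unfiltered: {unfiltered_rate / MIB:.1f} MiB/s")
print(f"filtered:   {filtered_rate / 1024:.1f} KiB/s")
print(f"ratio:      {ratio:.0f}x")
print(f"unfiltered capture over 2 hours: {projected_gib:.1f} GiB")
```

Under those assumptions the unfiltered sched_switch capture grows roughly
1860 times faster than the filtered sched_stat_blocked one, which is why the
former is impractical to leave running while waiting for a rare spike.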