Message-ID: <20141215173016.GN10476@twins.programming.kicks-ass.net>
Date:	Mon, 15 Dec 2014 18:30:16 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	Josef Bacik <jbacik@...com>
Cc:	bmaurer@...com, rkroll@...com, kernel-team@...com,
	mingo@...hat.com, linux-kernel@...r.kernel.org,
	umgwanakikbuti@...il.com, avagin@...nvz.org, rostedt@...dmis.org
Subject: Re: [PATCH] sched/fair: change where we report sched stats V2

On Mon, Dec 15, 2014 at 06:21:29PM +0100, Peter Zijlstra wrote:
> On Mon, Dec 15, 2014 at 10:37:09AM -0500, Josef Bacik wrote:
> 
> > >Yeah, so I don't like this, it adds overhead for everyone.
> > >
> > 
> > Only if SCHEDSTATS is enabled tho, and it's no more overhead in the
> > SCHEDSTATS case than before.  Would it be more acceptable to move the entire
> > callback under SCHEDSTATS?
> 
> Nah, doesn't work. Distros need to enable the world and then some so
> .config is a false choice.
> 
> > This is fine for discrete problems, but when trying to find a random latency
> > spike in a production workload it's impossible. If I do
> > 
> > trace-cmd record -e sched:sched_switch -T sleep 5
> > 
> > on just one of our random web servers I end up with this
> > 
> > du -h trace.dat
> > 62M     trace.dat
> > 
> > that's 62 megs in 5 seconds.  I ran the following command for almost 2 hours
> > when searching for a latency spike
> > 
> > trace-cmd record -B latency -e sched:sched_stat_blocked -f "delay >= 100000" -T -o /root/latency.dat
> > 
> > and got the following .dat file
> > 
> > du -h latency.dat
> > 48M     latency.dat
> 
> Ah, regardless of what I think of our filter implementation, that actually
> makes sense, let me ponder this a bit.

Oh, I just remembered we 'fixed' this for perf, see commit:

  e6dab5ffab59 ("perf/trace: Add ability to set a target task for events")

I'm not sure how to do the same thing with ftrace though, maybe Steve
knows.

The thing is, at wakeup time we know the task we're waking, so we pass
that task along and attribute the trace event to it instead of to
current. Andrew (who implemented it) might have some userspace to share.
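
To make the idea concrete, here is a minimal userspace sketch in C. It is
not the code from that commit; struct task, current_task,
emit_stat_blocked() and wake_task() are hypothetical names made up for
illustration. It only shows the attribution choice: the waker is the one
running the code, but the event is charged to the task being woken.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a task; NOT the kernel's struct task_struct. */
struct task {
	int pid;
	uint64_t events_seen;   /* events a per-task listener on this task would get */
	uint64_t last_delay_ns; /* payload of the most recent event */
};

/* Hypothetical equivalent of "current": the task executing right now. */
static struct task *current_task;

/*
 * Record a "blocked for delay_ns" event. If a target task is supplied the
 * event is attributed to it; otherwise it falls back to current_task.
 */
static void emit_stat_blocked(struct task *target, uint64_t delay_ns)
{
	struct task *owner = target ? target : current_task;

	assert(owner != NULL);
	owner->events_seen++;
	owner->last_delay_ns = delay_ns;
}

/*
 * Wakeup path: runs in the waker's context, but reports the blocked time
 * on behalf of the wakee by passing it along as the target.
 */
static void wake_task(struct task *waker, struct task *wakee, uint64_t blocked_ns)
{
	current_task = waker;
	emit_stat_blocked(wakee, blocked_ns);
}
```

With this shape, a listener attached only to the wakee sees the event even
though the wakee was not running when it was emitted, which is why a
per-task trace does not need to record every sched_switch system-wide.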