linux-kernel - Re: callchain sampling bug in perf?

Message-ID: <20100822004927.GC5258@nowhere>
Date:	Sun, 22 Aug 2010 02:49:29 +0200
From:	Frederic Weisbecker <fweisbec@...il.com>
To:	Christoph Hellwig <hch@...radead.org>
Cc:	Arnaldo Carvalho de Melo <acme@...radead.org>,
	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...e.hu>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: callchain sampling bug in perf?

On Sat, Aug 21, 2010 at 10:42:39AM -0400, Christoph Hellwig wrote:
> It does seem to fix the bug for some cases but not all.  Default perf
> report in TUI and the normal command line seem to get it right.  perf
> report -g flat still shows the old problem.  perf report -g flat,0.0
> shows callgraphs, but just as before they just show the 0.<n>
> percentages.

Yep. So I just found the other problem.
We collect every event and store them in per-tid histograms.
But depending on the final sorting (by default we sort by comm),
we may merge (collapse) the histograms according to the sorting
criteria. If this is by comm, per-tid histograms become
per-comm histograms, hence thread profiles merge into process
profiles. We have callbacks that handle this merge, but we
forgot to handle callchains.

So imagine we have three threads (tids: 1000, 1001, 1002) that
belong to python.

tid 1000 got 100 events
tid 1001 got 10 events
tid 1002 got 3 events

Once we merge these histograms to get a per comm result, we'll
finally get:

python got 113 events

The problem is that we merge the 1000 and 1001 histograms into 1002
without handling callchains. So the merged result, wrt callchains,
contains only the 1002 callchains: only those from one of the
threads survive.
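
To make the missing step concrete, here is a minimal sketch of such a collapse in Python. It is not perf's actual code: `HistEntry`, `collapse_by_comm` and the callchain symbols are hypothetical names and made-up data, chosen only to mirror the three-thread example above. The point is the single line that folds the source entry's callchains into the destination, which is the step the buggy merge skips.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class HistEntry:
    """Hypothetical stand-in for one histogram entry (not perf's struct)."""
    comm: str
    tid: int
    nr_events: int
    # callchain (tuple of symbols, outermost caller first) -> hit count
    callchains: Counter = field(default_factory=Counter)


def collapse_by_comm(entries):
    """Merge per-tid entries into per-comm entries, callchains included."""
    merged = {}
    for e in entries:
        dst = merged.get(e.comm)
        if dst is None:
            # First entry seen for this comm: copy it as the merge target.
            merged[e.comm] = HistEntry(e.comm, e.tid, e.nr_events,
                                       Counter(e.callchains))
            continue
        dst.nr_events += e.nr_events
        # The step the buggy merge skipped: fold the callchains in too.
        dst.callchains.update(e.callchains)
    return merged


# Made-up callchains; only the event counts come from the example above.
entries = [
    HistEntry("python", 1000, 100, Counter({("main", "eval", "sys_read"): 100})),
    HistEntry("python", 1001, 10, Counter({("main", "worker", "sys_write"): 10})),
    HistEntry("python", 1002, 3, Counter({("main", "gc", "sys_mmap"): 3})),
]
python = collapse_by_comm(entries)["python"]
print(python.nr_events)                 # 113
print(sum(python.callchains.values()))  # 113
```

Without the `update()` call, the event count would still reach 113 but the callchains would account for the hits of only one thread, which would explain the tiny percentages in the report.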

So, I'm going to fix that.

> Btw, even in normal perf report mode the numbers while they look correct
> confused me a bit.  The percentages before the callgraphs split are
> always relative to the node above, not absolute which is rather
> confusing.  And even despite adding -n to the perf report command line
> I only get absolute numbers for the processes but not the actual
> callgraphs.

That's the point of the fractal mode. It's a relative profiling against
the parent node.

But you can select the "graph" mode that does an absolute profiling (against
the total hits).
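
The difference between the two modes is plain arithmetic, sketched below in Python. The function name and the hit counts are made up for illustration and are not taken from perf: a parent node with 50 of the 200 total hits splits into two branches.

```python
def percentages(children_hits, parent_hits, total_hits):
    """Return {name: (relative_to_parent, absolute_vs_total)} in percent."""
    return {
        name: (100.0 * hits / parent_hits, 100.0 * hits / total_hits)
        for name, hits in children_hits.items()
    }


# Made-up numbers: 200 hits overall, 50 of them under one parent node,
# which splits into two child branches.
result = percentages({"branch_a": 40, "branch_b": 10},
                     parent_hits=50, total_hits=200)
for name, (fractal, graph) in result.items():
    print(f"{name}: fractal {fractal:.1f}%  graph {graph:.1f}%")
# branch_a: fractal 80.0%  graph 20.0%
# branch_b: fractal 20.0%  graph 5.0%
```

In fractal mode the children of a node always sum to 100% of that node, while in graph mode each figure can be compared directly against the whole profile.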

> > > Also the flat mode is rendered incorrectly, it just adds different call
> > > graphs inside a single process directly after each other instead of
> > > separating them in the rendering.
> > 
> > 
> > I'm not sure what you mean. The flat format is a pure dump of every callchain
> > that belongs to a single process (or whatever kind of histogram source...).
> > 
> > What do you mean by separating them in the rendering?
> 
> If there are different callchains leading to the same tracepoint they
> are just appended to each other with no visual indication that they are
> separate callchains

Ah right. There is a blank line between callchains. If that's confusing I
can add a kind of "----" boundary.

Thanks.
