linux-kernel - Re: [PATCH 34/37] perf hists browser: Support flat callchains

Message-ID: <20151125012608.GA6171@sejong>
Date:	Wed, 25 Nov 2015 10:26:08 +0900
From:	Namhyung Kim <namhyung@...nel.org>
To:	Arnaldo Carvalho de Melo <arnaldo.melo@...il.com>
CC:	Frederic Weisbecker <fweisbec@...il.com>,
	Ingo Molnar <mingo@...nel.org>, linux-kernel@...r.kernel.org,
	Andi Kleen <andi@...stfloor.org>,
	David Ahern <dsahern@...il.com>, Jiri Olsa <jolsa@...hat.com>,
	Kan Liang <kan.liang@...el.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: [PATCH 34/37] perf hists browser: Support flat callchains

Hi Arnaldo,

On Tue, Nov 24, 2015 at 12:45:51PM -0200, Arnaldo Carvalho de Melo wrote:
> Em Tue, Nov 24, 2015 at 02:27:08PM +0900, Namhyung Kim escreveu:
> > On Mon, Nov 23, 2015 at 04:16:48PM +0100, Frederic Weisbecker wrote:
> > > On Thu, Nov 19, 2015 at 02:53:20PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > From: Namhyung Kim <namhyung@...nel.org>
> > > [...]
> > > 
> > > > +int callchain_node__make_parent_list(struct callchain_node *node)
> > > > +{
> > > > +	struct callchain_node *parent = node->parent;
> > > > +	struct callchain_list *chain, *new;
> > > > +	LIST_HEAD(head);
> > > > +
> > > > +	while (parent) {
> > > > +		list_for_each_entry_reverse(chain, &parent->val, list) {
> > > > +			new = malloc(sizeof(*new));
> > > > +			if (new == NULL)
> > > > +				goto out;
> > > > +			*new = *chain;
> > > > +			new->has_children = false;
> > > > +			list_add_tail(&new->list, &head);
> > > > +		}
> > > > +		parent = parent->parent;
> > > > +	}
> > > > +
> > > > +	list_for_each_entry_safe_reverse(chain, new, &head, list)
> > > > +		list_move_tail(&chain->list, &node->parent_val);
> > > > +
> > > > +	if (!list_empty(&node->parent_val)) {
> > > > +		chain = list_first_entry(&node->parent_val, struct callchain_list, list);
> > > > +		chain->has_children = rb_prev(&node->rb_node) || rb_next(&node->rb_node);
> > > > +
> > > > +		chain = list_first_entry(&node->val, struct callchain_list, list);
> > > > +		chain->has_children = false;
> > > 
> > > I'm a bit puzzled by this: can't we rewind through the parents when printing
> > > or adding to the flat rbtree instead of having this parent_val field?
> > 
> > Yes, this code is to simplify things on parent nodes.  Maybe we could
> > go up to the parents and print the callchain list there, as you said.
> > 
> > However, the problem, I think, is how to handle the 'has_children'
> > information on parents.  That info controls the folding status of each
> > callchain.  As the info is in struct callchain_list, and the flat and
> > folded callchain modes require it to be in the top-most entry, I
> > cannot share entries in parent nodes.
> > 
> > Thus I simply copied the callchain lists in the parents to the leaf
> > nodes.  Yes, it will consume some memory, but it simplifies the code.
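
The copy-to-leaf idea described above can be sketched outside of perf as
follows.  This is only a minimal illustration, not the patch itself: the
'struct entry' and 'struct node' types, the singly linked lists, and the
helper names are all hypothetical stand-ins for struct callchain_list,
struct callchain_node and the kernel list API, and the rbtree sibling check
that sets has_children on the top-most entry is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct callchain_list: one chain entry. */
struct entry {
	const char *sym;
	bool has_children;	/* folding state is stored per entry */
	struct entry *next;
};

/* Hypothetical stand-in for struct callchain_node. */
struct node {
	struct node *parent;
	struct entry *val;		/* entries owned by this node */
	struct entry *parent_val;	/* private copies of ancestor entries */
};

static void free_entries(struct entry *e)
{
	while (e) {
		struct entry *next = e->next;

		free(e);
		e = next;
	}
}

/*
 * Copy every ancestor entry into node->parent_val, root-most entry first.
 * Returns 0 on success, -1 if an allocation fails (nothing is leaked).
 */
static int make_parent_list(struct node *node)
{
	struct entry *head = NULL;

	assert(node != NULL);

	for (struct node *p = node->parent; p; p = p->parent) {
		struct entry *seg = NULL, **tail = &seg;

		for (struct entry *e = p->val; e; e = e->next) {
			struct entry *copy = malloc(sizeof(*copy));

			if (copy == NULL) {
				free_entries(seg);
				free_entries(head);
				return -1;
			}
			*copy = *e;
			copy->has_children = false;	/* reset folding state */
			copy->next = NULL;
			*tail = copy;
			tail = &copy->next;
		}
		/* Entries closer to the root go in front of what we have. */
		*tail = head;
		head = seg;
	}

	node->parent_val = head;
	return 0;
}
```

Because each leaf gets its own copies, setting has_children on the first
entry of one leaf's parent_val does not affect a sibling leaf or the shared
parent entries.  The price is one allocation per ancestor entry per leaf,
which is the memory cost mentioned above.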
> 
> I haven't done any measuring, but I'm noticing that 'perf top -g' is
> showing more warnings about not being able to process events fast
> enough, and so ends up losing events.  I tried with --max-stack 16 and
> it helped; this is just a heads up.

OK, but that seems unrelated to this patch, since this patch only
affects the flat and folded callchain modes.

> 
> Perhaps my workstation workloads are getting deeper callchains over
> time, or perhaps it is the cost of processing callchains that is
> increasing; I need to stop and try to quantify this.
> 
> We really need to look at reducing the overhead of processing
> callchains.

Right, but with my multi-thread work, I realized that perf has been
getting heavier recently.  I guess it's mostly due to the atomic
refcount work.  I need to get back to the multi-thread work.

Anyway, I made initial multi-thread support for perf top too.  I
think I posted it to the list, but I cannot find the link.  You can
take a look at it on the 'perf/top-threaded-v1' branch in my tree.

  git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git



Thanks,
Namhyung