Date:   Fri, 12 May 2017 12:37:01 +0200
From:   Milian Wolff <milian.wolff@...b.com>
To:     Namhyung Kim <namhyung@...nel.org>
Cc:     linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
        Arnaldo Carvalho de Melo <acme@...hat.com>,
        David Ahern <dsahern@...il.com>,
        Peter Zijlstra <a.p.zijlstra@...llo.nl>,
        Yao Jin <yao.jin@...ux.intel.com>, kernel-team@....com
Subject: Re: [PATCH v2] perf report: distinguish between inliners in the same function

On Wednesday, 10 May 2017 07:53:52 CEST Namhyung Kim wrote:
> Hi,
> 
> On Wed, May 03, 2017 at 11:35:36PM +0200, Milian Wolff wrote:

<snip>

> > +static enum match_result match_chain_srcline(struct callchain_cursor_node *node,
> > +					     struct callchain_list *cnode)
> > +{
> > +	char *left = get_srcline(cnode->ms.map->dso,
> > +				 map__rip_2objdump(cnode->ms.map, cnode->ip),
> > +				 cnode->ms.sym, true, false);
> > +	char *right = get_srcline(node->map->dso,
> > +				  map__rip_2objdump(node->map, node->ip),
> > +				  node->sym, true, false);
> > +	enum match_result ret = match_chain_strings(left, right);
> 
> I think we need to check inlined srclines as well.  There might be a
> case where two samples have different addresses (and come from different
> callchains) but happen to be mapped to the same srcline IMHO.

I think I'm missing something, but isn't this what this function provides? The 
function above is now being used by the match_chain_inliner function below. 

Ah, or do you mean for code such as this:

~~~~~
inline_func_1(); inline_func_2();
~~~~~

Here, both calls could be inlined into the same line and the same issue 
would occur, i.e. different branches get collapsed into the first match for 
the given srcline?
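
Spelled out as a complete (purely hypothetical) example:

~~~~~
/* Both calls sit on a single source line, so samples inside inline_func_1
 * and inline_func_2 resolve to the same srcline.  A srcline-only
 * comparison would thus treat the two distinct inlined branches as equal
 * and collapse them into whichever branch was matched first. */
static inline int inline_func_1(void) { return 1; }
static inline int inline_func_2(void) { return 2; }

int caller(void)
{
	return inline_func_1() + inline_func_2(); /* one srcline, two branches */
}
~~~~~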

Hm yes, this should be fixed too.

But, quite frankly, I think this shows more and more that the current 
inliner support is really fragile and leads to lots of issues throughout the 
code base: inlined frames are represented differently from non-inlined 
frames, yet for the most part they should be handled just like them.

So maybe it's time to think once more about going back to my initial 
approach: make inlined frames code-wise equal to non-inlined frames, i.e. 
instead of requesting the inlined frames within match_chain, do it outside and 
create callchain_node/callchain_cursor instances (not sure which one right 
now) for the inlined frames too (see the sketch after the list below).

This way, we should be able to centrally add support for inlined frames and 
all areas will benefit from it:

- aggregation by srcline/function will magically work
- all browsers will automatically display them, i.e. no longer need to duplicate the code for inliner support in perf script, perf report tui/stdio/...
- we can easily support --inline in other tools, like `perf trace --call-graph`
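
Roughly, I imagine something like the following. This is only a sketch of 
the idea, not working code: append_inline_frames is a made-up helper, 
callchain_cursor_append is shown with a simplified signature, and each 
inlined frame would really need its own synthetic symbol built from the 
inline_list funcname:

~~~~~
/* Sketch: expand inlined frames into ordinary cursor nodes at the point
 * where a resolved frame is appended, instead of special-casing them
 * inside match_chain.  Hypothetical helper, simplified signatures. */
static int append_inline_frames(struct callchain_cursor *cursor,
				struct map *map, struct symbol *sym, u64 ip)
{
	struct inline_node *inode;
	struct inline_list *ilist;
	int err = 0;

	inode = dso__parse_addr_inlines(map->dso, map__rip_2objdump(map, ip));
	if (!inode)
		return 0; /* no inline info, caller appends the plain frame */

	list_for_each_entry(ilist, &inode->val, list) {
		/* each inlined frame becomes a regular cursor node; a real
		 * implementation would pass a synthetic symbol built from
		 * ilist->funcname instead of reusing sym */
		err = callchain_cursor_append(cursor, ip, map, sym);
		if (err)
			break;
	}

	inline_node__delete(inode);
	return err;
}
~~~~~

With something like that in place, match_chain would never need to see any 
inline-specific state at all.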

So before I invest more time trying to massage match_chain to behave well for 
inline nodes, can I get some feedback on the above?

Back when Jin and I discussed this, no one from the core perf contributors 
ever bothered to give us any insight into which approach they consider 
better.

> > +
> >  	free_srcline(left);
> >  	free_srcline(right);
> >  	return ret;
> >  }
> > 
> > +static enum match_result match_chain_inliner(struct callchain_cursor_node *node,
> > +					     struct callchain_list *cnode)
> > +{
> > +	u64 left_ip = map__rip_2objdump(cnode->ms.map, cnode->ip);
> > +	u64 right_ip = map__rip_2objdump(node->map, node->ip);
> > +	struct inline_node *left_node = NULL;
> > +	struct inline_node *right_node = NULL;
> > +	struct inline_list *left_entry = NULL;
> > +	struct inline_list *right_entry = NULL;
> > +	struct inline_list *left_last_entry = NULL;
> > +	struct inline_list *right_last_entry = NULL;
> > +	enum match_result ret = MATCH_EQ;
> > +
> > +	left_node = dso__parse_addr_inlines(cnode->ms.map->dso, left_ip);
> > +	if (!left_node)
> > +		return MATCH_ERROR;
> > +
> > +	right_node = dso__parse_addr_inlines(node->map->dso, right_ip);
> > +	if (!right_node) {
> > +		inline_node__delete(left_node);
> > +		return MATCH_ERROR;
> > +	}
> > +
> > +	left_entry = list_first_entry(&left_node->val,
> > +				      struct inline_list, list);
> > +	left_last_entry = list_last_entry(&left_node->val,
> > +					  struct inline_list, list);
> > +	right_entry = list_first_entry(&right_node->val,
> > +				       struct inline_list, list);
> > +	right_last_entry = list_last_entry(&right_node->val,
> > +					  struct inline_list, list);
> 
> What about keeping the number of entries in an inline_node so that we
> can check the numbers for a faster comparison?

What benefit would that have? The performance cost is dominated by finding the 
inlined nodes, not by doing the comparison on the callstack.
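
Just to make sure we mean the same thing, I read the suggestion as roughly 
this, where nr_entries is a hypothetical field that struct inline_node does 
not have today and that dso__parse_addr_inlines would have to fill in:

~~~~~
/* Hypothetical early-out before walking the two lists pairwise: compare
 * the (would-be) cached list lengths first.  nr_entries does not exist
 * in struct inline_node as of this patch. */
if (left_node->nr_entries != right_node->nr_entries) {
	enum match_result ret = left_node->nr_entries < right_node->nr_entries
			      ? MATCH_LT : MATCH_GT;

	inline_node__delete(left_node);
	inline_node__delete(right_node);
	return ret;
}
~~~~~

Even with that, the pairwise walk is linear in the inline depth, which is 
usually tiny, so I doubt the early-out would be measurable.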

> > +	while (ret == MATCH_EQ && (left_entry || right_entry)) {
> > +		ret = match_chain_strings(left_entry ? left_entry->funcname : NULL,
> > +					  right_entry ? right_entry->funcname : NULL);
> > +
> > +		if (left_entry && left_entry != left_last_entry)
> > +			left_entry = list_next_entry(left_entry, list);
> > +		else
> > +			left_entry = NULL;
> > +
> > +		if (right_entry && right_entry != right_last_entry)
> > +			right_entry = list_next_entry(right_entry, list);
> > +		else
> > +			right_entry = NULL;
> > +	}
> > +
> > +	inline_node__delete(left_node);
> > +	inline_node__delete(right_node);
> > +	return ret;
> > +}
> > +
> >  static enum match_result match_chain(struct callchain_cursor_node *node,
> >  				     struct callchain_list *cnode)
> >  {
> > @@ -671,7 +728,13 @@ static enum match_result match_chain(struct callchain_cursor_node *node,
> >  	}
> >  
> >  	if (left == right) {
> > -		if (node->branch) {
> > +		if (symbol_conf.inline_name && cnode->ip != node->ip) {
> > +			enum match_result match = match_chain_inliner(node,
> > +								      cnode);
> > +
> > +			if (match != MATCH_ERROR)
> > +				return match;
> 
> I guess it'd be better to just return the match result.  Otherwise
> MATCH_ERROR will be converted to MATCH_EQ...

This is done on purpose, to fall back to the IP-based comparison. That way, 
entries without inlined nodes will be sorted the same way as before this 
patch.

Cheers
-- 
Milian Wolff | milian.wolff@...b.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts