Message-ID: <4F691289.7080505@fb.com>
Date: Tue, 20 Mar 2012 16:28:09 -0700
From: Arun Sharma <asharma@...com>
To: Frederic Weisbecker <fweisbec@...il.com>
CC: <linux-kernel@...r.kernel.org>, Ingo Molnar <mingo@...e.hu>,
"Arnaldo Carvalho de Melo" <acme@...hat.com>,
Mike Galbraith <efault@....de>,
"Paul Mackerras" <paulus@...ba.org>,
Peter Zijlstra <peterz@...radead.org>,
"Stephane Eranian" <eranian@...gle.com>,
Namhyung Kim <namhyung.kim@....com>,
"Tom Zanussi" <tzanussi@...il.com>,
<linux-perf-users@...r.kernel.org>
Subject: Re: [PATCH] perf: Add a new sort order: SORT_INCLUSIVE (v4)
On 3/19/12 8:57 AM, Frederic Weisbecker wrote:
>>> Each hist have a period of 1, but the total period is 1.
>>> So the end result should be (IIUC):
>>>
>>> 100% foo a
>>> 100% foo b
>>> |
>>> --- a
>>> 100% foo c
>>> |
>>> --- b
>>> |
>>> --- c
>>>
>>
>> That is correct. The first column no longer adds up to 100%.
>
> So do we really want this?
>
I think so. It's a different way of presenting the data: think of a pie
chart vs. a bar chart of OS market share, where people may be using more
than one OS, so the bars need not add up to 100%.
I'll post some documentation updates.
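To make the arithmetic in the a/b/c example concrete, here is a minimal sketch of inclusive accounting. It is not the code from the patch: struct incl_hist, incl_add_sample() and incl_percent() are hypothetical names used only for illustration, and skipping a symbol that repeats within one chain (recursion) is an assumption of this sketch. The point it shows is that every symbol on the chain is credited with the sample's period, while total_period grows only once per sample.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 16

/* Hypothetical per-symbol accumulator; not perf's struct hist_entry. */
struct incl_entry {
	const char *sym;
	unsigned long period;
};

struct incl_hist {
	struct incl_entry entries[MAX_SYMS];
	size_t nr;
	unsigned long total_period;
};

/* Look up a symbol, creating its entry on first use. */
static struct incl_entry *incl_find(struct incl_hist *h, const char *sym)
{
	for (size_t i = 0; i < h->nr; i++)
		if (strcmp(h->entries[i].sym, sym) == 0)
			return &h->entries[i];
	if (h->nr == MAX_SYMS)
		return NULL;
	h->entries[h->nr].sym = sym;
	h->entries[h->nr].period = 0;
	return &h->entries[h->nr++];
}

/* Credit one sample to every distinct symbol on its callchain. */
static void incl_add_sample(struct incl_hist *h, const char **chain,
			    size_t depth, unsigned long period)
{
	for (size_t i = 0; i < depth; i++) {
		int seen = 0;

		/* Assumption: a symbol repeated in the chain is credited once. */
		for (size_t j = 0; j < i; j++)
			if (strcmp(chain[i], chain[j]) == 0)
				seen = 1;
		if (seen)
			continue;

		struct incl_entry *e = incl_find(h, chain[i]);
		if (e)
			e->period += period;
	}
	/* Counted once per sample, not once per credited entry. */
	h->total_period += period;
}

/* Inclusive share of a symbol, in percent of total_period. */
static double incl_percent(const struct incl_hist *h, const char *sym)
{
	if (h->total_period == 0)
		return 0.0;
	for (size_t i = 0; i < h->nr; i++)
		if (strcmp(h->entries[i].sym, sym) == 0)
			return 100.0 * h->entries[i].period / h->total_period;
	return 0.0;
}
```

Feeding it a single sample with period 1 and the chain c <- b <- a gives a, b and c a period of 1 each against a total of 1, which is why the first column stops adding up to 100%. Adding the period to total_period inside the loop instead would inflate the total.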
>> If we don't do this, total_period will be inflated.
>
> Yeah right I've just tried and callchains look right. I'm just puzzled
> by the percentages:
>
Thanks for testing this!
> + 98,99% [k] execve
> + 98,99% [k] stub_execve
> + 98,99% [k] do_execve
> + 98,99% [k] do_execve_common
> + 98,99% [k] sys_execve
> + 53,12% [k] __libc_start_main
> + 53,12% [k] cmd_record
These look like they belong to the perf binary and are incorrectly
classified as kernel samples. The problem is that callchain_get() does
not populate the privilege level, so al_child simply inherits the
privilege level of the sample:
+ for (i = 0; i < cursor->nr; i++) {
+ struct addr_location al_child = *al;
+
+ err = callchain_get(&iter, &al_child);
Not all fields of al_child are populated by callchain_get().
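One possible way to derive a per-entry privilege level, sketched under assumptions: raw callchains carry context markers (PERF_CONTEXT_KERNEL, PERF_CONTEXT_USER) between the addresses, and the level of each entry can be tracked while walking the chain. This is not the fix that went into the patch. classify_chain() is a hypothetical helper, the CTX_* macros are local copies of what I believe the PERF_CONTEXT_* values in perf_event.h to be, and 'k' / '.' follow the level characters perf report prints.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Local stand-ins for PERF_CONTEXT_KERNEL, PERF_CONTEXT_USER and
 * PERF_CONTEXT_MAX, redefined here so the sketch builds on its own.
 */
#define CTX_KERNEL ((uint64_t)-128)
#define CTX_USER   ((uint64_t)-512)
#define CTX_MAX    ((uint64_t)-4095)

/*
 * Hypothetical helper: walk a raw callchain, drop the context markers,
 * and record a privilege level for every real address.
 * Entries seen before any marker fall back to the sample's own level.
 * Returns the number of addresses written to out_ips/out_levels.
 */
static size_t classify_chain(const uint64_t *ips, size_t nr,
			     char sample_level,
			     uint64_t *out_ips, char *out_levels)
{
	char level = sample_level;
	size_t n = 0;

	for (size_t i = 0; i < nr; i++) {
		if (ips[i] >= CTX_MAX) {
			/* A context marker, not an address. */
			if (ips[i] == CTX_KERNEL)
				level = 'k';
			else if (ips[i] == CTX_USER)
				level = '.';
			continue;
		}
		out_ips[n] = ips[i];
		out_levels[n] = level;
		n++;
	}
	return n;
}
```

With something along these lines, entries such as main or cmd_record that come after the user marker would be tagged '.' even when the sample itself was taken in the kernel.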
> + 53,12% [k] T.101
> + 53,12% [k] main
> + 53,12% [k] run_builtin
> + 52,11% [k] perf_evlist__prepare_workload
> + 52,09% [k] T.1163
The rest of them look ok to me. If something doesn't make sense, please
point me at the output of "perf script".
>
>>
>>> Also this feature reminds me a lot the -b option in perf report.
>>> Branch sorting and callchain inclusive sorting are a bit different in
>>> the way they handle the things but the core idea is the same. Callchains
>>> are branches as well.
>>>
>>
>> Yes - I kept asking why the branch stack stuff doesn't use the
>> existing callchain logic.
>
> Because I fear that loops branches could make the tree representation useless.
>
Loops can happen in callgraphs too, right (e.g. recursive programs)?
The other problem with branch stacks/LBR is that they're sampled branches.
Just because I got a sample with:
a -> b
b -> c
doesn't necessarily mean that the callchain was a -> b -> c.
I still don't have the branch stack setup working properly. But I'm now
more sympathetic to the view that last branch sampling and callchains
may have different representations in perf.
-Arun