linux-kernel - Re: [BUG] perf report --pid not reporting correctly

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 21 Sep 2016 19:22:29 -0700
From:   Stephane Eranian <eranian@...gle.com>
To:     Namhyung Kim <namhyung@...nel.org>
Cc:     Jiri Olsa <jolsa@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...hat.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        "mingo@...e.hu" <mingo@...e.hu>,
        "ak@...ux.intel.com" <ak@...ux.intel.com>,
        Vince Weaver <vincent.weaver@...ne.edu>,
        Arnaldo Carvalho de Melo <acme@...nel.org>
Subject: Re: [BUG] perf report --pid not reporting correctly

On Wed, Sep 21, 2016 at 6:34 PM, Namhyung Kim <namhyung@...nel.org> wrote:
> On Wed, Sep 21, 2016 at 01:18:52PM -0700, Stephane Eranian wrote:
>> On Wed, Sep 21, 2016 at 9:34 AM, Jiri Olsa <jolsa@...hat.com> wrote:
>> > On Wed, Sep 21, 2016 at 12:37:53PM -0300, Arnaldo Carvalho de Melo wrote:
>> >> Em Tue, Sep 20, 2016 at 06:29:59PM -0700, Stephane Eranian escreveu:
>> >> > Hi Arnaldo,
>> >> >
>> >> > I ran into an issue trying to use the --pid filtering option of perf report.
>> >> >
>> >> > I do a system-wide collection and then I want to narrow down the
>> >> > reporting to a specific process:
>> >> >
>> >> > $ perf record -a -e cycles:pp sleep 10
>> >> > $ perf report --sort cpu,comm --pid X
>> >> >
>> >> > Where X is a process sampled during the run (easy to catch with perf report -D)
>> >> > If you do it this way, it works, but if you do:
>> >> >
>> >> > $ perf report --sort cpu --pid X
>> >> >
>> >> > Then you get an empty output.
>> >> >
>> >> > I suspect it has to do with the way hist entries are added to the
>> >> > histogram and aggregated. If the first event for a sort criteria is
>> >> > not coming from pid X, it will
>> >> > still be added in the histogram. if pid X aggregates to the same
>> >> > sample criteria, then you will lose the pid information. And then
>> >> > later when you try to apply the filter,
>> >> > it will mark the hist entry as FILTERED because it does not have a matching pid
>> >> > and nothing will be printed.
>> >> > I suspect you want to apply the filtering upfront for pid. It will
>> >> > only add to the histograms matching samples. It changes the
>> >> > percentages you will see. They will
>> >> > only report the breakdown for the pid.
>> >> >
>> >> > I have a quick hack to do upfront filtering which does something as
>> >> > follows but I am not sure this is the correct way of doing this.
>> >> >
>> >> > Let me know what you think.
>> >>
>> >> From a first look I think this makes sense, i.e. we should do the first
>> >> round of filtering, one that trows away stuff, for things in the command
>> >> line, when creating the histogram entries.
>> >>
>> >> Later, as we have now, we can apply further filters for non-collapsed
>> >> fields of hist_entry.
>> >>
>> >> Jiri, Namhyung, are you ok with this?
>> >
>> > Stephan is correct with analysis, but I think we need to add both non/filtered
>> > entries in, because we provide that 'F' key for non/filtered counts switch in tui
>> >
>> > how about something like below
>> >
>> > thanks,
>> > jirka
>> >
>> > ---
>> > diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
>> > index b02992efb513..659e0357be68 100644
>> > --- a/tools/perf/util/hist.c
>> > +++ b/tools/perf/util/hist.c
>> > @@ -536,6 +536,14 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
>> >                                 map__put(he->ms.map);
>> >                                 he->ms.map = map__get(entry->ms.map);
>> >                         }
>> > +
>> > +                       /*
>> > +                        * We have at least one entry in which is not
>> > +                        * filtered, we want to display the entry.
>> > +                        */
>> > +                       if (he->filtered && !entry->filtered)
>> > +                               he->filtered = 0;
>> > +
>> >                         goto out;
>> >                 }
>> >
>> Works for me. So with this approach the % shown with --pid still
>> represents % of total samples and not just for that pid.
>> I think this is okay as long as this is documented and understood by users.
>> Thanks.
>
> I think we should show correct value depending on the --percentage
> option.   I wrote a patch to implement it by addding a
> total_early_filtered_period stat to hists.  Following is the result:
>
>
>   $ perf report -s cpu,comm --pid 0 --stdio
>   #
>   # Overhead  CPU  Command
>   # ........  ...  .......
>   #
>       12.16%  000  swapper
>        3.09%  001  swapper
>        2.76%  002  swapper
>        2.23%  003  swapper
>        1.65%  007  swapper
>        1.65%  008  swapper
>        1.52%  009  swapper
>        1.51%  006  swapper
>        1.46%  004  swapper
>        1.34%  005  swapper
>        0.94%  010  swapper
>        0.90%  011  swapper
>
So how do I interpret this?

Is this that 12.16% of all samples comes from  pid 0 (swapper) running on CPU0?

>   $ perf report -s cpu --pid 0 --stdio
>   #
>   # Overhead  CPU
>   # ........  ...
>   #
>       12.16%  000
>        3.09%  001
>        2.76%  002
>        2.23%  003
>        1.65%  007
>        1.65%  008
>        1.52%  009
>        1.51%  006
>        1.46%  004
>        1.34%  005
>        0.94%  010
>        0.90%  011
>
12.16% of all the samples collected come from pid 0 (swapper) running CPU0?


>   $ perf report -s cpu --pid 0 --stdio --percentage relative
>   #
>   # Overhead  CPU
>   # ........  ...
>   #
>       38.95%  000
>        9.92%  001
>        8.84%  002
>        7.16%  003
>        5.30%  007
>        5.28%  008
>        4.87%  009
>        4.83%  006
>        4.66%  004
>        4.30%  005
>        3.00%  010
>        2.89%  011
>
Ok, so now I am guessing 38.95% of the samples of pid 0 are on CPU0?

>
>   Note that the --hierarchy option provides groups rather than filtering
>   but shows similar result..
>
>   $ perf report -s pid,cpu --stdio --hierarchy
>   #
>   #    Overhead  Pid:Command / CPU
>   # ...........  .......................
>   #
>       31.21%     0:swapper
>          12.16%     000
>           3.09%     001
>           2.76%     002
>           2.23%     003
>           1.65%     007
>           1.65%     008
>           1.52%     009
>           1.51%     006
>           1.46%     004
>           1.34%     005
>           0.94%     010
>           0.90%     011
>       19.15%     8618:getmail
>          ...
>
31.21% of total samples come from pid 0 (swapper) and decompose
to 12.16% for CPU0, 3.09% for CPU1, ....

Is that right?

>
>
>
>
> -----------------------8<-------------------------------
> diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
> index 8d363d5e65a2..42b1bfd29ef8 100644
> --- a/tools/perf/util/event.h
> +++ b/tools/perf/util/event.h
> @@ -262,6 +262,7 @@ enum auxtrace_error_type {
>   */
>  struct events_stats {
>         u64 total_period;
> +       u64 total_early_filtered_period;
>         u64 total_non_filtered_period;
>         u64 total_lost;
>         u64 total_lost_samples;
> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> index 37a08f20730a..c7045411cce2 100644
> --- a/tools/perf/util/hist.c
> +++ b/tools/perf/util/hist.c
> @@ -1017,12 +1017,21 @@ int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
>                          int max_stack_depth, void *arg)
>  {
>         int err, err2;
> +       struct hists *hists = evsel__hists(iter->evsel);
>
>         err = sample__resolve_callchain(iter->sample, &callchain_cursor, &iter->parent,
>                                         iter->evsel, al, max_stack_depth);
>         if (err)
>                 return err;
>
> +       if (symbol__parent_filter(iter->parent))
> +               al->filtered |= symbol__parent_filter(iter->parent);
> +
> +       if (al->filtered) {
> +               hists->stats.total_early_filtered_period += iter->sample->period;
> +               return 0;
> +       }
> +
>         iter->max_stack = max_stack_depth;
>
>         err = iter->ops->prepare_entry(iter, al);
> @@ -1503,7 +1512,7 @@ static void hists__reset_filter_stats(struct hists *hists)
>  void hists__reset_stats(struct hists *hists)
>  {
>         hists->nr_entries = 0;
> -       hists->stats.total_period = 0;
> +       hists->stats.total_period = hists->stats.total_early_filtered_period;
>
>         hists__reset_filter_stats(hists);
>  }
> @@ -1530,7 +1539,7 @@ static void hierarchy_recalc_total_periods(struct hists *hists)
>
>         node = rb_first(&hists->entries);
>
> -       hists->stats.total_period = 0;
> +       hists->stats.total_period = hists->stats.total_early_filtered_period;
>         hists->stats.total_non_filtered_period = 0;
>
>         /*