linux-kernel - Re: [PATCH] perf topdown: Correct leader selection with sample

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <c0a9ffb6-e6ea-4159-9cc0-a23df5e59429@linux.intel.com>
Date: Mon, 1 Jul 2024 17:51:09 +0800
From: "Mi, Dapeng" <dapeng1.mi@...ux.intel.com>
To: "Liang, Kan" <kan.liang@...ux.intel.com>, Ian Rogers <irogers@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
 Arnaldo Carvalho de Melo <acme@...nel.org>,
 Namhyung Kim <namhyung@...nel.org>, Adrian Hunter <adrian.hunter@...el.com>,
 Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
 linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
 Dapeng Mi <dapeng1.mi@...el.com>
Subject: Re: [PATCH] perf topdown: Correct leader selection with sample_read
 enabled


On 6/29/2024 4:27 AM, Liang, Kan wrote:
>
> On 2024-06-28 2:28 p.m., Ian Rogers wrote:
>> On Thu, Jun 27, 2024 at 11:17 PM Mi, Dapeng <dapeng1.mi@...ux.intel.com> wrote:
>>> On 6/27/2024 11:11 PM, Liang, Kan wrote:
>>>> On 2024-06-14 5:39 p.m., Dapeng Mi wrote:
>>>>
>>>> Besides, we need a test for the sampling read as well.
>>>> Ian has provided a very good base. Please add a topdown sampling read
>>>> case on top of it as well.
>>>> https://lore.kernel.org/lkml/CAP-5=fUkg-cAXTb+3wbFOQCfdXgpQeZw40XHjfrNFbnBD=NMXg@mail.gmail.com/
>>> Sure. I would look at it and add a test case.
>> Thanks Dapeng and thanks Kan too! I wonder if we can do a regular
>> counter and a leader sample counter then compare the counts are
>> reasonably consistent. Something like this:
>>
>> ```
>> $ perf stat -e instructions perf test -w noploop
>>
>> Performance counter stats for '/tmp/perf/perf test -w noploop':
>>
>>    25,779,785,496      instructions
>>
>>       1.008047696 seconds time elapsed
>>
>>       1.003754000 seconds user
>>       0.003999000 seconds sys
>> ```
>>
>> ```
>> cat << "_end_of_file_" > a.py
>> last_count = None
>>
>> def process_event(param_dict):
>>    if ("ev_name" in param_dict and "sample" in param_dict and
>>        param_dict["ev_name"] == "instructions"):
>>        sample = param_dict["sample"]
>>        if "values" in sample:
>>            global last_count
>>            last_count = sample["values"][1][1]
>>
>> def trace_end():
>>    global last_count
>>    print(last_count)
>> _end_of_file_
>> $ sudo perf record -o -  -e "{cycles,instructions}:S" perf test -w
>> noploop|perf script -i - -s ./a.py
>> [ perf record: Woken up 2 times to write data ]
>> [ perf record: Captured and wrote 0.459 MB - ]
>> 22195356100
>> ```
>>
>> I didn't see a simpler way to get count and I don't think it is right.
> The perf stat can cover the whole life cycle of a workload. But I think
> the result of perf record can only give the sum from the beginning to
> the last sample.
> There are some differences.
>
>> There's some similar perf script checking of data in
>> tools/perf/tests/shell/test_intel_pt.sh.
>>
> I think the case should be to test the output of the perf script, rather
> than verify the accuracy of an event.
>
> If so, we may run two same events. They should show the exact same
> results in a sample.
>
> For example,
>
> perf record  -e "{branches,branches}:Su" -c 1000000 ./perf test -w brstack
> perf script
> perf  752598 349300.123884:    1000002 branches:      7f18676a875a
> do_lookup_x+0x2fa (/usr/lib64/l>
> perf  752598 349300.123884:    1000002 branches:      7f18676a875a
> do_lookup_x+0x2fa (/usr/lib64/l>
> perf  752598 349300.124854:    1000005 branches:      7f18676a90b6
> _dl_lookup_symbol_x+0x56 (/usr/>
> perf  752598 349300.124854:    1000005 branches:      7f18676a90b6
> _dl_lookup_symbol_x+0x56 (/usr/>
> perf  752598 349300.125914:     999998 branches:      7f18676a8556
> do_lookup_x+0xf6 (/usr/lib64/ld>
> perf  752598 349300.125914:     999998 branches:      7f18676a8556
> do_lookup_x+0xf6 (/usr/lib64/ld>
> perf  752598 349300.127401:    1000009 branches:            4c1adf
> brstack_bench+0x15 (/home/kan/o>
> perf  752598 349300.127401:    1000009 branches:            4c1adf
> brstack_bench+0x15 (/home/kan/o>

This looks a more accurate validation. I would add this test.


>
> Thanks,
> Kan
>
>> Thanks,
>> Ian
>>