linux-kernel - Re: [PATCH 00/10] Stitch LBR call stack

Message-ID: <ba3025a0-2161-b1d6-0a37-3445eebe7609@linux.intel.com>
Date:   Mon, 7 Oct 2019 16:06:24 -0400
From:   "Liang, Kan" <kan.liang@...ux.intel.com>
To:     Ingo Molnar <mingo@...nel.org>
Cc:     peterz@...radead.org, acme@...nel.org,
        linux-kernel@...r.kernel.org, jolsa@...nel.org,
        namhyung@...nel.org, ak@...ux.intel.com,
        vitaly.slobodskoy@...el.com, pavel.gerasimov@...el.com
Subject: Re: [PATCH 00/10] Stitch LBR call stack



On 10/7/2019 2:24 PM, Ingo Molnar wrote:
> 
> * kan.liang@...ux.intel.com <kan.liang@...ux.intel.com> wrote:
> 
>> Performance impact:
>> The processing time may increase with the LBR stitching approach
>> enabled. The impact depends on the number of samples with stitched LBRs.
>>
>> For sqlite's tcltest,
>> perf record --call-graph lbr -- make tcltest
>> perf report --stitch-lbr
>>
>> 4.11% of the samples have stitched LBRs.
>> Total number of samples:                        2833728
>> The number of samples with stitched LBRs        116478
>>
>> The processing time of perf report increases by 6.8%.
>> Without --stitch-lbr:                           55906106 usec
>> With --stitch-lbr:                              59728701 usec
>>
>> For a simple test case, tchain_edit, with a call stack depth of 43,
>> perf record --call-graph lbr -- ./tchain_edit
>> perf report --stitch-lbr
>>
>> 99.9% of the samples have stitched LBRs.
>> Total number of samples:                        10915
>> The number of samples with stitched LBRs        10905
>>
>> The processing time of perf report increases by 67.4%.
>> Without --stitch-lbr:                           11970508 usec
>> With --stitch-lbr:                              20036055 usec
> 
> That cost seems pretty high, while the feature sounds useful - is there
> any way to speed this up?
> 
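
The percentages quoted above follow directly from the raw counts and
timings. A quick check, written in Python purely for illustration (the
helper names pct_share and pct_increase are made up here and are not
part of perf):

```python
def pct_share(part: int, total: int) -> float:
    """Return part as a percentage of total."""
    return part / total * 100.0


def pct_increase(before: int, after: int) -> float:
    """Return the relative increase from before to after, in percent."""
    return (after - before) / before * 100.0


# sqlite tcltest: sample counts and perf report times (usec) from the mail
sqlite_share = pct_share(116478, 2833728)
sqlite_cost = pct_increase(55906106, 59728701)

# tchain_edit: sample counts and perf report times (usec) from the mail
tchain_share = pct_share(10905, 10915)
tchain_cost = pct_increase(11970508, 20036055)

print(f"sqlite tcltest: {sqlite_share:.2f}% stitched, +{sqlite_cost:.1f}% time")
print(f"tchain_edit:    {tchain_share:.1f}% stitched, +{tchain_cost:.1f}% time")
```

The overhead grows with the share of stitched samples, which is why
tchain_edit is hit so much harder than tcltest.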

For each LBR entry, the perf tool calculates and generates a node that
is appended to the callchain_cursor.
The stitched LBR entries come from the previous sample, so it looks like
we don't need to redo that calculation for them. That should speed up
the whole process. I will do more testing on this.
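
To make the idea concrete, here is a rough sketch of the caching, in
Python rather than perf's C. It is not the perf implementation: LbrEntry,
CursorNode, Resolver and build_chain are hypothetical names, and the
"expensive" resolution step is only simulated with a call counter.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class LbrEntry:
    """One branch record (hypothetical stand-in for an LBR entry)."""
    from_ip: int
    to_ip: int


@dataclass(frozen=True)
class CursorNode:
    """Resolved node that would be appended to the callchain cursor."""
    ip: int
    symbol: str


class Resolver:
    """Simulates the per-entry calculation and counts how often it runs."""

    def __init__(self) -> None:
        self.calls = 0

    def resolve(self, entry: LbrEntry) -> CursorNode:
        self.calls += 1
        return CursorNode(ip=entry.from_ip, symbol=f"sym_{entry.from_ip:#x}")


def build_chain(resolver: Resolver,
                current: List[LbrEntry],
                stitched: List[LbrEntry],
                prev_nodes: Dict[LbrEntry, CursorNode]) -> List[CursorNode]:
    """Resolve the current sample's entries, then reuse cached nodes for
    entries stitched in from the previous sample where available."""
    nodes = [resolver.resolve(e) for e in current]
    for e in stitched:
        cached = prev_nodes.get(e)
        nodes.append(cached if cached is not None else resolver.resolve(e))
    return nodes


resolver = Resolver()

# Previous sample: both entries are resolved once and remembered.
prev_entries = [LbrEntry(0x10, 0x20), LbrEntry(0x30, 0x40)]
prev_nodes = {e: resolver.resolve(e) for e in prev_entries}

# Current sample: one fresh entry, plus the two stitched from the previous one.
current_entries = [LbrEntry(0x50, 0x60)]
chain = build_chain(resolver, current_entries, prev_entries, prev_nodes)

print(len(chain), resolver.calls)
```

In this toy run only the one fresh entry is resolved for the current
sample; the two stitched entries reuse the nodes built for the previous
sample.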

Thanks,
Kan