Message-ID: <ba3025a0-2161-b1d6-0a37-3445eebe7609@linux.intel.com>
Date: Mon, 7 Oct 2019 16:06:24 -0400
From: "Liang, Kan" <kan.liang@...ux.intel.com>
To: Ingo Molnar <mingo@...nel.org>
Cc: peterz@...radead.org, acme@...nel.org,
linux-kernel@...r.kernel.org, jolsa@...nel.org,
namhyung@...nel.org, ak@...ux.intel.com,
vitaly.slobodskoy@...el.com, pavel.gerasimov@...el.com
Subject: Re: [PATCH 00/10] Stitch LBR call stack
On 10/7/2019 2:24 PM, Ingo Molnar wrote:
>
> * kan.liang@...ux.intel.com <kan.liang@...ux.intel.com> wrote:
>
>> Performance impact:
>> The processing time may increase with the LBR stitching approach
>> enabled. The impact depends on the number of samples with stitched LBRs.
>>
>> For sqlite's tcltest,
>> perf record --call-graph lbr -- make tcltest
>> perf report --stitch-lbr
>>
>> 4.11% of the samples have stitched LBRs.
>> Total number of samples: 2833728
>> The number of samples with stitched LBRs: 116478
>>
>> The processing time of perf report increases by 6.8%
>> Without --stitch-lbr: 55906106 usec
>> With --stitch-lbr: 59728701 usec
>>
>> For a simple test case, tchain_edit, with a call stack depth of 43:
>> perf record --call-graph lbr -- ./tchain_edit
>> perf report --stitch-lbr
>>
>> 99.9% of the samples have stitched LBRs.
>> Total number of samples: 10915
>> The number of samples with stitched LBRs: 10905
>>
>> The processing time of perf report increases by 67.4%
>> Without --stitch-lbr: 11970508 usec
>> With --stitch-lbr: 20036055 usec
>
> That cost seems pretty high, while the feature sounds useful - is there
> any way to speed this up?
>
For each LBR entry, the perf tool calculates and generates a node that
is appended to callchain_cursor.
The stitched LBR entries come from the previous sample, so it looks like
we don't need to redo that calculation for them. That should speed up
the whole process. I will do more testing on it.
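
A rough sketch of the idea, as a toy model in Python rather than perf's
C code. All names here (CursorNode, Resolver, build_chain, the reuse
flag) are made up for illustration and do not exist in perf. It assumes
the stitched entries are simply the tail of the previous sample's
already-resolved chain, which may not match the real layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CursorNode:
    """Hypothetical stand-in for a resolved callchain_cursor node."""
    ip: int
    symbol: str


class Resolver:
    """Models the per-entry calculation and counts how often it runs."""

    def __init__(self, symbols):
        self.symbols = symbols  # maps ip -> symbol name
        self.lookups = 0

    def resolve(self, ip):
        self.lookups += 1
        return CursorNode(ip, self.symbols.get(ip, "[unknown]"))


def build_chain(own_ips, n_stitched, prev_chain, resolver, reuse):
    """Build the chain for one sample.

    own_ips:    LBR entries recorded in the current sample.
    n_stitched: number of entries stitched from the previous sample
                (assumed here to be the tail of prev_chain).
    reuse:      if True, append the previous sample's nodes as they are;
                if False, recompute each of them.
    """
    # Entries of the current sample always need the calculation.
    chain = [resolver.resolve(ip) for ip in own_ips]

    # Stitched part: the last n_stitched nodes of the previous chain.
    tail = prev_chain[len(prev_chain) - n_stitched:]
    if reuse:
        chain.extend(tail)
    else:
        chain.extend(resolver.resolve(node.ip) for node in tail)
    return chain
```

With reuse enabled, the number of resolve() calls per sample depends
only on that sample's own entries, not on how many entries were
stitched in. The resulting chain is the same in both modes.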
Thanks,
Kan