linux-kernel - Re: [RFC/PATCHSET 00/11] perf mem: Add new output fields for data source (v1)

Message-ID: <7beb0d9e-8707-4421-a88b-cc494ef1e880@amd.com>
Date: Mon, 12 May 2025 15:31:11 +0530
From: Ravi Bangoria <ravi.bangoria@....com>
To: Namhyung Kim <namhyung@...nel.org>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>,
 Ian Rogers <irogers@...gle.com>, Kan Liang <kan.liang@...ux.intel.com>,
 Jiri Olsa <jolsa@...nel.org>, Adrian Hunter <adrian.hunter@...el.com>,
 Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...nel.org>,
 LKML <linux-kernel@...r.kernel.org>,
 "linux-perf-users@...r.kernel.org" <linux-perf-users@...r.kernel.org>,
 Leo Yan <leo.yan@....com>, Stephane Eranian <eranian@...gle.com>,
 Ravi Bangoria <ravi.bangoria@....com>
Subject: Re: [RFC/PATCHSET 00/11] perf mem: Add new output fields for data
 source (v1)

Hi Namhyung,

>>> The names of some new fields are the same as the corresponding sort
>>> keys (mem, op, snoop) so I had to change the order whether it's
>>> applied as an output field or a sort key.  Maybe it's better to name
>>> them differently but I couldn't come up with better ideas.
>>
>> 1) These semantic changes to the field names seem counterintuitive
>>    (to me). Example:
>>
>>    -F mem:
>>
>>      Without patch:
>>
>>      $ perf mem report -F overhead,sample,mem --stdio
>>      # Overhead       Samples  Memory access
>>          39.29%             1  L3 hit
>>          37.50%            21  N/A
>>          23.21%            13  L1 hit
>>
>>      With patch:
>>
>>      $ perf mem report -F overhead,sample,mem --stdio
>>      #                          Memory
>>      # Overhead       Samples    Other
>>         100.00%            35   100.0%
> 
> Yep, that's because I split the 'mem' part to 'cache' and 'mem' because
> he_mem_stat can handle up to 8 entries.

+1.

>  As your samples hit mostly in
> the caches, you'd get a similar result when you run:
> 
>   $ perf mem report -F overhead,sample,cache --stdio
> 
>>
>>    -F 'snoop':
>>
>>      Without patch:
>>
>>      $ perf mem report -F overhead,sample,snoop --stdio
>>      # Overhead       Samples  Snoop
>>          60.71%            34  N/A
>>          39.29%             1  HitM
>>    
>>      With patchset:
>>
>>      $ perf mem report -F overhead,sample,snoop --stdio
>>      #                         --- Snoop ----
>>      # Overhead       Samples     HitM  Other
>>         100.00%            35    39.3%  60.7%
> 
> This matches the 'Overhead' distribution without the patch, right?

Right, it does.

>> 2) It was not intuitive (to me:)) that perf-mem overhead is calculated
>>    using sample->weight by overwriting sample->period. I also don't see
>>    it documented anywhere (or did I miss it?)
> 
> I don't see the documentation and I also find it confusing.  Sometimes I
> think the weight is better but sometimes not. :(  At least we could add
> an option to control that (like --use-weight ?).

this and below ...

> Also we now have 'weight' output field so users can see it, although it
> shows averages.
> 
>>
>>    perf report:
>>
>>      $ perf report -F overhead,sample,period,dso --stdio
>>      # Overhead  Samples   Period  Shared Object
>>          80.00%       28  2800000  [kernel.kallsyms]
>>           5.71%        2   200000  ld-linux-x86-64.so.2
>>           5.71%        2   200000  libc.so.6
>>           5.71%        2   200000  ls
>>           2.86%        1   100000  libpcre2-8.so.0.11.2
>>
>>    perf mem report:
>>
>>      $ perf mem report -F overhead,sample,period,dso --stdio
>>      # Overhead  Samples   Period  Shared Object
>>          87.50%       28       49  [kernel.kallsyms]
>>           3.57%        2        2  ld-linux-x86-64.so.2
>>           3.57%        2        2  libc.so.6
>>           3.57%        2        2  ls
>>           1.79%        1        1  libpcre2-8.so.0.11.2
>>
>> 3) Similarly, it was not intuitive (again, to me:)) that -F op/snoop/dtlb
>>    percentages are calculated based on sample->weight.
> 
> Hmm.. ok.  Maybe better to use the original period for percentage
> breakdown in the new output fields.  For examples, in the above result
> you have 13 samples for L1 and 1 sample for L3 but the weight of L3
> access is bigger.  But I guess users probably want to see L1 access was
> dominant.

... I'm also not sure. Logically, it makes sense to use weight as overhead.
Also, it dates back to ~2014 and nobody has complained so far. So I'm just
being pedantic 🙂. For now, how about just documenting it in the perf-mem man
page and leaving it at that? I'm attaching the patch at the end.
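
As a rough illustration of the difference (this is only arithmetic on the
numbers quoted above, not perf code), the two DSO tables can be reproduced
by normalizing the Period column of each report. The helper name
`overhead` and the two dictionaries are made up for this sketch; the
values are copied from the `perf report` and `perf mem report` output in
point 2, where the `perf mem report` Period column is assumed to hold the
summed sample weight.

```python
def overhead(values):
    """Return each entry's share of the column total, in percent."""
    total = sum(values.values())
    return {name: 100.0 * v / total for name, v in values.items()}

# Period column from `perf report` (actual sample period).
period = {
    "[kernel.kallsyms]": 2800000,
    "ld-linux-x86-64.so.2": 200000,
    "libc.so.6": 200000,
    "ls": 200000,
    "libpcre2-8.so.0.11.2": 100000,
}

# Period column from `perf mem report` (assumed to be sample weight).
weight = {
    "[kernel.kallsyms]": 49,
    "ld-linux-x86-64.so.2": 2,
    "libc.so.6": 2,
    "ls": 2,
    "libpcre2-8.so.0.11.2": 1,
}

by_period = overhead(period)
by_weight = overhead(weight)

for dso in period:
    print(f"{by_period[dso]:6.2f}%  {by_weight[dso]:6.2f}%  {dso}")
```

The first column should line up with the `perf report` Overhead values and
the second with the `perf mem report` ones, so the same 35 samples give a
different ranking weight depending on which quantity is summed.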

>> 4) I have a similar recommended perf-mem command in the perf-amd-ibs man
>>    page. Can you please update the alternate command there?
>>    https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/Documentation/perf-amd-ibs.txt?h=v6.15-rc5#n167
> 
> Sure will do.

Thanks!

------------><---------------
From 7e4393ab7b20f8d89a5dece08fdd925e3e50b15a Mon Sep 17 00:00:00 2001
From: Ravi Bangoria <ravi.bangoria@....com>
Date: Mon, 12 May 2025 06:22:57 +0000
Subject: [PATCH] perf mem doc: Describe overhead calculation in brief

Unlike perf-report which uses sample period for overhead calculation,
perf-mem overhead is calculated using sample weight. Describe perf-mem
overhead calculation method in its man page.

Signed-off-by: Ravi Bangoria <ravi.bangoria@....com>
---
 tools/perf/Documentation/perf-mem.txt | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/tools/perf/Documentation/perf-mem.txt b/tools/perf/Documentation/perf-mem.txt
index a9e3c71a2205..965e73d37772 100644
--- a/tools/perf/Documentation/perf-mem.txt
+++ b/tools/perf/Documentation/perf-mem.txt
@@ -137,6 +137,25 @@ REPORT OPTIONS
 In addition, for report all perf report options are valid, and for record
 all perf record options.
 
+OVERHEAD CALCULATION
+--------------------
+Unlike linkperf:perf-report[1], which calculates overhead from the actual
+sample period, perf-mem overhead is calculated using sample weight. For
+example, suppose the perf.data file has two samples with the same sample
+period, but one sample has weight 180 and the other has weight 20:
+
+  $ perf script -F period,data_src,weight,ip,sym
+  100000    629080842 |OP LOAD|LVL L3 hit|...     20       7e69b93ca524 strcmp
+  100000   1a29081042 |OP LOAD|LVL RAM hit|...   180   ffffffff82429168 memcpy
+
+  $ perf report -F overhead,symbol
+  50%   [.] strcmp
+  50%   [k] memcpy
+
+  $ perf mem report -F overhead,symbol
+  90%   [k] memcpy
+  10%   [.] strcmp
+
 SEE ALSO
 --------
 linkperf:perf-record[1], linkperf:perf-report[1], linkperf:perf-arm-spe[1]
-- 
2.43.0

Thanks,
Ravi