lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250610005742.2173050-1-namhyung@kernel.org>
Date: Mon,  9 Jun 2025 17:57:42 -0700
From: Namhyung Kim <namhyung@...nel.org>
To: Arnaldo Carvalho de Melo <acme@...nel.org>,
	Ian Rogers <irogers@...gle.com>,
	Kan Liang <kan.liang@...ux.intel.com>
Cc: Jiri Olsa <jolsa@...nel.org>,
	Adrian Hunter <adrian.hunter@...el.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	linux-perf-users@...r.kernel.org,
	Ravi Bangoria <ravi.bangoria@....com>
Subject: [PATCH for-v6.16] perf mem: Describe the new output fields in the doc

Update the documentation of the new fields with examples and caveats.
Also update the related documentation for AMD IBS.

Cc: Ravi Bangoria <ravi.bangoria@....com>
Signed-off-by: Namhyung Kim <namhyung@...nel.org>
---
 tools/perf/Documentation/perf-amd-ibs.txt | 59 ++++++++++++++++-------
 tools/perf/Documentation/perf-mem.txt     | 50 +++++++++++++++++++
 2 files changed, 92 insertions(+), 17 deletions(-)

diff --git a/tools/perf/Documentation/perf-amd-ibs.txt b/tools/perf/Documentation/perf-amd-ibs.txt
index 55f80beae0375a72..54854993576070c3 100644
--- a/tools/perf/Documentation/perf-amd-ibs.txt
+++ b/tools/perf/Documentation/perf-amd-ibs.txt
@@ -171,23 +171,48 @@ Below is a simple example of the perf mem tool.
 	# perf mem report
 
 A normal perf mem report output will provide detailed memory access profile.
-However, it can also be aggregated based on output fields. For example:
-
-	# perf mem report -F mem,sample,snoop
-	Samples: 3M of event 'ibs_op//', Event count (approx.): 23524876
-	Memory access                                 Samples  Snoop
-	N/A                                           1903343  N/A
-	L1 hit                                        1056754  N/A
-	L2 hit                                          75231  N/A
-	L3 hit                                           9496  HitM
-	L3 hit                                           2270  N/A
-	RAM hit                                          8710  N/A
-	Remote node, same socket RAM hit                 3241  N/A
-	Remote core, same node Any cache hit             1572  HitM
-	Remote core, same node Any cache hit              514  N/A
-	Remote node, same socket Any cache hit           1216  HitM
-	Remote node, same socket Any cache hit            350  N/A
-	Uncached hit                                       18  N/A
+New output fields will show related access info together.  For example:
+
+	# perf mem report -F overhead,cache,snoop,comm
+	...
+	# Samples: 92K of event 'ibs_op//'
+	# Total weight : 531104
+	#
+	#           ---------- Cache -----------  --- Snoop ----
+	# Overhead       L1     L2 L1-buf  Other     HitM  Other  Command
+	# ........  ............................  ..............  ..........
+	#
+	    76.07%     5.8%  35.7%   0.0%  34.6%    23.3%  52.8%  cc1
+	     5.79%     0.2%   0.0%   0.0%   5.6%     0.1%   5.7%  make
+	     5.78%     0.1%   4.4%   0.0%   1.2%     0.5%   5.3%  gcc
+	     5.33%     0.3%   3.9%   0.0%   1.1%     0.2%   5.2%  as
+	     5.00%     0.1%   3.8%   0.0%   1.0%     0.3%   4.7%  sh
+	     1.56%     0.1%   0.1%   0.0%   1.4%     0.6%   0.9%  ld
+	     0.28%     0.1%   0.0%   0.0%   0.2%     0.1%   0.2%  pkg-config
+	     0.09%     0.0%   0.0%   0.0%   0.1%     0.0%   0.1%  git
+	     0.03%     0.0%   0.0%   0.0%   0.0%     0.0%   0.0%  rm
+	     ...
+
+Also, it can be aggregated based on various memory access info using the
+sort keys.  For example:
+
+	# perf mem report -s mem,snoop
+	...
+	# Samples: 92K of event 'ibs_op//'
+	# Total weight : 531104
+	# Sort order   : mem,snoop
+	#
+	# Overhead       Samples  Memory access                            Snoop
+	# ........  ............  .......................................  ............
+	#
+	    47.99%          1509  L2 hit                                   N/A
+	    25.08%           338  core, same node Any cache hit            HitM
+	    10.24%         54374  N/A                                      N/A
+	     6.77%         35938  L1 hit                                   N/A
+	     6.39%           101  core, same node Any cache hit            N/A
+	     3.50%            69  RAM hit                                  N/A
+	     0.03%           158  LFB/MAB hit                              N/A
+	     0.00%             2  Uncached hit                             N/A
 
 Please refer to their man page for more detail.
 
diff --git a/tools/perf/Documentation/perf-mem.txt b/tools/perf/Documentation/perf-mem.txt
index 965e73d377724607..4d164836d0943119 100644
--- a/tools/perf/Documentation/perf-mem.txt
+++ b/tools/perf/Documentation/perf-mem.txt
@@ -119,6 +119,22 @@ REPORT OPTIONS
 	And the default sort keys are changed to local_weight, mem, sym, dso,
 	symbol_daddr, dso_daddr, snoop, tlb, locked, blocked, local_ins_lat.
 
+-F::
+--fields=::
+	Specify output field - multiple keys can be specified in CSV format.
+	Please see linkperf:perf-report[1] for details.
+
+	In addition to the default fields, 'perf mem report' will provide the
+	following fields to break down sample periods.
+
+	- op: operation in the sample instruction (load, store, prefetch, ...)
+	- cache: location in CPU cache (L1, L2, ...) where the sample hit
+	- mem: location in memory or other places the sample hit
+	- dtlb: location in Data TLB (L1, L2) where the sample hit
+	- snoop: snoop result for the sampled data access
+
+	Please take a look at the OUTPUT FIELD SELECTION section for caveats.
+
 -T::
 --type-profile::
 	Show data-type profile result instead of code symbols.  This requires
@@ -156,6 +172,40 @@ there are two samples in perf.data file, both with the same sample period,
   90%   [k] memcpy
   10%   [.] strcmp
 
+OUTPUT FIELD SELECTION
+----------------------
+"perf mem report" adds a number of new output fields specific to data source
+information in the sample.  Some of them have the same name with the existing
+sort keys ("mem" and "snoop").  So unlike other fields and sort keys, they'll
+behave differently when it's used by -F/--fields or -s/--sort.
+
+Using those two as output fields will aggregate samples altogether and show
+breakdown.
+
+  $ perf mem report -F mem,snoop
+  ...
+  # ------ Memory -------  --- Snoop ----
+  #     RAM Uncach  Other     HitM  Other
+  # .....................  ..............
+  #
+       3.5%   0.0%  96.5%    25.1%  74.9%
+
+But using the same name for sort keys will aggregate samples for each type
+separately.
+
+  $ perf mem report -s mem,snoop
+  # Overhead       Samples  Memory access                            Snoop
+  # ........  ............  .......................................  ............
+  #
+      47.99%          1509  L2 hit                                   N/A
+      25.08%           338  core, same node Any cache hit            HitM
+      10.24%         54374  N/A                                      N/A
+       6.77%         35938  L1 hit                                   N/A
+       6.39%           101  core, same node Any cache hit            N/A
+       3.50%            69  RAM hit                                  N/A
+       0.03%           158  LFB/MAB hit                              N/A
+       0.00%             2  Uncached hit                             N/A
+
 SEE ALSO
 --------
 linkperf:perf-record[1], linkperf:perf-report[1], linkperf:perf-arm-spe[1]
-- 
2.50.0.rc0.604.gd4ff7b7c86-goog


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ