Message-ID: <20250430205548.789750-1-namhyung@kernel.org>
Date: Wed, 30 Apr 2025 13:55:37 -0700
From: Namhyung Kim <namhyung@...nel.org>
To: Arnaldo Carvalho de Melo <acme@...nel.org>,
	Ian Rogers <irogers@...gle.com>,
	Kan Liang <kan.liang@...ux.intel.com>
Cc: Jiri Olsa <jolsa@...nel.org>,
	Adrian Hunter <adrian.hunter@...el.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	linux-perf-users@...r.kernel.org,
	Ravi Bangoria <ravi.bangoria@....com>,
	Leo Yan <leo.yan@....com>
Subject: [RFC/PATCHSET 00/11] perf mem: Add new output fields for data source (v1)

Hello,

perf mem uses PERF_SAMPLE_DATA_SRC, which carries a lot of information
about each memory access.  There are various sort keys to group related
samples together, but the result is still cumbersome to read.  While the
perf c2c command provides a way to investigate the data in a specific
way, I'd like to add more generic ways using new output fields.

For example, the following is the 'cache' output field, which breaks
down the sample weights by cache level.

  $ perf mem record -a sleep 1
  
  $ perf mem report -F cache,dso,sym --stdio
  ...
  #
  # -------------- Cache --------------
  #      L1     L2     L3 L1-buf  Other  Shared Object                                  Symbol
  # ...................................  .....................................  .........................................
  #
       0.0%   0.0%   0.0%   0.0% 100.0%  [kernel.kallsyms]                      [k] ioread8
     100.0%   0.0%   0.0%   0.0%   0.0%  [kernel.kallsyms]                      [k] _raw_spin_lock_irq
       0.0%   0.0%   0.0%   0.0% 100.0%  [xhci_hcd]                             [k] xhci_update_erst_dequeue
       0.0%   0.0%   0.0%  95.8%   4.2%  [kernel.kallsyms]                      [k] smaps_account
       0.6%   1.8%  22.7%  45.5%  29.5%  [kernel.kallsyms]                      [k] sched_balance_update_blocked_averages
      29.4%   0.0%   1.6%  58.8%  10.2%  [kernel.kallsyms]                      [k] __update_load_avg_cfs_rq
       0.0%   8.5%   4.3%   0.0%  87.2%  [kernel.kallsyms]                      [k] copy_mc_enhanced_fast_string
      63.9%   0.0%   8.0%  23.8%   4.3%  [kernel.kallsyms]                      [k] psi_group_change
       3.9%   0.0%   9.3%  35.7%  51.1%  [kernel.kallsyms]                      [k] timerqueue_add
      35.9%  10.9%   0.0%  39.0%  14.2%  [kernel.kallsyms]                      [k] memcpy
      94.1%   0.0%   0.0%   5.9%   0.0%  [kernel.kallsyms]                      [k] unmap_page_range
      25.7%   0.0%   4.9%  51.0%  18.4%  [kernel.kallsyms]                      [k] __update_load_avg_se
       0.0%  24.9%  19.4%   9.6%  46.1%  [kernel.kallsyms]                      [k] _copy_to_iter
      12.9%   0.0%   0.0%  87.1%   0.0%  [kernel.kallsyms]                      [k] next_uptodate_folio
      36.8%   0.0%   9.5%  16.6%  37.1%  [kernel.kallsyms]                      [k] update_curr
     100.0%   0.0%   0.0%   0.0%   0.0%  bpf_prog_b9611ccbbb3d1833_dfs_iter     [k] bpf_prog_b9611ccbbb3d1833_dfs_iter
      45.4%   1.8%  20.4%  23.6%   8.8%  [kernel.kallsyms]                      [k] audit_filter_rules.isra.0
      92.8%   0.0%   0.0%   7.2%   0.0%  [kernel.kallsyms]                      [k] filemap_map_pages
      10.6%   0.0%   0.0%  89.4%   0.0%  [kernel.kallsyms]                      [k] smaps_page_accumulate
      38.3%   0.0%  29.6%  27.1%   5.0%  [kernel.kallsyms]                      [k] __schedule

Please see the description of each commit for other fields.

A new mem_stat field was added to the hist_entry to save this
information.  It's a generic data structure (array) that can handle
different types of information like cache-level, memory location,
snoop-result, etc.
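
To make the array idea concrete, here is a minimal sketch of a per-entry
accumulator that keeps one weight slot per bucket and turns it into the
per-row percentages shown in the 'cache' output above.  It is not the code
from the patches: the names cache_bucket, mem_stat_sketch and the
mem_stat_*() helpers are hypothetical, and the bucket list simply mirrors
the column headers of the example output.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bucket indices, mirroring the example column headers. */
enum cache_bucket {
	CACHE_L1,
	CACHE_L2,
	CACHE_L3,
	CACHE_L1_BUF,
	CACHE_OTHER,
	MEM_STAT_LEN,	/* number of buckets, not a bucket itself */
};

/* Hypothetical per-entry accumulator: one weight slot per bucket. */
struct mem_stat_sketch {
	uint64_t entries[MEM_STAT_LEN];
};

/* Add one sample's weight to the bucket it was classified into. */
static void mem_stat_add(struct mem_stat_sketch *ms, enum cache_bucket b,
			 uint64_t weight)
{
	assert((unsigned int)b < MEM_STAT_LEN);
	ms->entries[b] += weight;
}

/* Sum of all buckets for this entry. */
static uint64_t mem_stat_total(const struct mem_stat_sketch *ms)
{
	uint64_t total = 0;

	for (int i = 0; i < MEM_STAT_LEN; i++)
		total += ms->entries[i];
	return total;
}

/* Share of one bucket within the entry, as a percentage (0 if empty). */
static double mem_stat_percent(const struct mem_stat_sketch *ms,
			       enum cache_bucket b)
{
	uint64_t total = mem_stat_total(ms);

	return total ? 100.0 * (double)ms->entries[b] / (double)total : 0.0;
}
```

Because the slots are plain array indices, the same accumulator shape could
be reused for another breakdown (memory location, snoop result, ...) by
swapping the enum, which is the point of keeping the structure generic.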

The first patch is a fix for the hierarchy mode and it was sent
separately.  I include it here so as not to break the hierarchy mode.
The second patch enables SAMPLE_DATA_SRC without SAMPLE_ADDR and
perf_event_attr.mmap_data, which generate a lot more data.
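
As a rough illustration of what "data source without address" means at the
perf_event_open() level, the sketch below fills a perf_event_attr that asks
for PERF_SAMPLE_DATA_SRC but neither PERF_SAMPLE_ADDR nor mmap_data.  The
helper name init_data_src_only_attr() is hypothetical, the extra sample
bits (IP, TID, WEIGHT) are my assumption about a useful minimum, and the
event type/config are deliberately left unset because the memory event
encoding is PMU specific.  How perf record actually wires this up is in the
patch itself.

```c
#include <linux/perf_event.h>
#include <string.h>

/*
 * Hypothetical helper: request the data source of each sample without
 * the data address and without data mmap tracking.
 */
static void init_data_src_only_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);

	/* Assumed minimum: where, who, how costly, and the data source. */
	attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_TID |
			    PERF_SAMPLE_WEIGHT | PERF_SAMPLE_DATA_SRC;

	/* No PERF_SAMPLE_ADDR above, and no data mmap events either. */
	attr->mmap_data = 0;

	/* attr->type and attr->config are PMU specific and left unset. */
}
```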

The names of some new fields are the same as the corresponding sort
keys (mem, op, snoop), so I had to change the order in which a name is
matched as an output field or a sort key.  Maybe it's better to name
them differently but I couldn't come up with better ideas.

That means you need to use the -F/--fields option to specify those
fields and the sort keys you want.  Maybe we can change the default
output and sort keys for perf mem report with this.

The code is available at 'perf/mem-field-v1' branch in

 git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks,
Namhyung


Namhyung Kim (11):
  perf hist: Remove output field from sort-list properly
  perf record: Add --sample-mem-info option
  perf hist: Support multi-line header
  perf hist: Add struct he_mem_stat
  perf hist: Basic support for mem_stat accounting
  perf hist: Implement output fields for mem stats
  perf mem: Add 'op' output field
  perf hist: Hide unused mem stat columns
  perf mem: Add 'cache' and 'memory' output fields
  perf mem: Add 'snoop' output field
  perf mem: Add 'dtlb' output field

 tools/perf/Documentation/perf-record.txt |   7 +-
 tools/perf/builtin-record.c              |   6 +
 tools/perf/ui/browsers/hists.c           |  50 ++++-
 tools/perf/ui/hist.c                     | 272 ++++++++++++++++++++++-
 tools/perf/ui/stdio/hist.c               |  57 +++--
 tools/perf/util/evsel.c                  |   2 +-
 tools/perf/util/hist.c                   |  78 +++++++
 tools/perf/util/hist.h                   |  22 ++
 tools/perf/util/mem-events.c             | 183 ++++++++++++++-
 tools/perf/util/mem-events.h             |  57 +++++
 tools/perf/util/record.h                 |   1 +
 tools/perf/util/sort.c                   |  42 +++-
 12 files changed, 718 insertions(+), 59 deletions(-)

-- 
2.49.0.906.g1f30a19c02-goog

