Message-ID: <20250430205548.789750-1-namhyung@kernel.org>
Date: Wed, 30 Apr 2025 13:55:37 -0700
From: Namhyung Kim <namhyung@...nel.org>
To: Arnaldo Carvalho de Melo <acme@...nel.org>,
Ian Rogers <irogers@...gle.com>,
Kan Liang <kan.liang@...ux.intel.com>
Cc: Jiri Olsa <jolsa@...nel.org>,
Adrian Hunter <adrian.hunter@...el.com>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
linux-perf-users@...r.kernel.org,
Ravi Bangoria <ravi.bangoria@....com>,
Leo Yan <leo.yan@....com>
Subject: [RFC/PATCHSET 00/11] perf mem: Add new output fields for data source (v1)
Hello,
perf mem uses PERF_SAMPLE_DATA_SRC, which carries a lot of information
about each memory access. There are various sort keys to group related
samples together, but the result is still cumbersome to read. While the
perf c2c command provides a way to investigate the data in a specific
way, I'd like to add more generic ways using new output fields.
For example, the following is the 'cache' output field, which breaks
down the sample weights by cache level.
$ perf mem record -a sleep 1
$ perf mem report -F cache,dso,sym --stdio
...
#
# -------------- Cache --------------
#      L1     L2     L3 L1-buf  Other  Shared Object                          Symbol
# ...................................  .....................................  .........................................
#
     0.0%   0.0%   0.0%   0.0% 100.0%  [kernel.kallsyms]                      [k] ioread8
   100.0%   0.0%   0.0%   0.0%   0.0%  [kernel.kallsyms]                      [k] _raw_spin_lock_irq
     0.0%   0.0%   0.0%   0.0% 100.0%  [xhci_hcd]                             [k] xhci_update_erst_dequeue
     0.0%   0.0%   0.0%  95.8%   4.2%  [kernel.kallsyms]                      [k] smaps_account
     0.6%   1.8%  22.7%  45.5%  29.5%  [kernel.kallsyms]                      [k] sched_balance_update_blocked_averages
    29.4%   0.0%   1.6%  58.8%  10.2%  [kernel.kallsyms]                      [k] __update_load_avg_cfs_rq
     0.0%   8.5%   4.3%   0.0%  87.2%  [kernel.kallsyms]                      [k] copy_mc_enhanced_fast_string
    63.9%   0.0%   8.0%  23.8%   4.3%  [kernel.kallsyms]                      [k] psi_group_change
     3.9%   0.0%   9.3%  35.7%  51.1%  [kernel.kallsyms]                      [k] timerqueue_add
    35.9%  10.9%   0.0%  39.0%  14.2%  [kernel.kallsyms]                      [k] memcpy
    94.1%   0.0%   0.0%   5.9%   0.0%  [kernel.kallsyms]                      [k] unmap_page_range
    25.7%   0.0%   4.9%  51.0%  18.4%  [kernel.kallsyms]                      [k] __update_load_avg_se
     0.0%  24.9%  19.4%   9.6%  46.1%  [kernel.kallsyms]                      [k] _copy_to_iter
    12.9%   0.0%   0.0%  87.1%   0.0%  [kernel.kallsyms]                      [k] next_uptodate_folio
    36.8%   0.0%   9.5%  16.6%  37.1%  [kernel.kallsyms]                      [k] update_curr
   100.0%   0.0%   0.0%   0.0%   0.0%  bpf_prog_b9611ccbbb3d1833_dfs_iter     [k] bpf_prog_b9611ccbbb3d1833_dfs_iter
    45.4%   1.8%  20.4%  23.6%   8.8%  [kernel.kallsyms]                      [k] audit_filter_rules.isra.0
    92.8%   0.0%   0.0%   7.2%   0.0%  [kernel.kallsyms]                      [k] filemap_map_pages
    10.6%   0.0%   0.0%  89.4%   0.0%  [kernel.kallsyms]                      [k] smaps_page_accumulate
    38.3%   0.0%  29.6%  27.1%   5.0%  [kernel.kallsyms]                      [k] __schedule
Please see the description of each commit for other fields.
A new mem_stat field is added to the hist_entry to save this
information. It's a generic data structure (an array) that handles
different types of information like cache level, memory location,
snoop result, etc.
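
As a rough illustration only (not the code from the patches), a
per-entry array of this kind could look like the sketch below. The
names cache_level, mem_stat_sketch, mem_stat_add and mem_stat_percent
are hypothetical; the category list is taken from the columns of the
'cache' output above, and the percentage is assumed to be each
category's share of the entry's total weight.

```c
#include <stdint.h>

/* Hypothetical category indexes, mirroring the 'cache' columns above. */
enum cache_level {
	CACHE_L1,
	CACHE_L2,
	CACHE_L3,
	CACHE_L1_BUF,
	CACHE_OTHER,
	CACHE_NR,
};

/* One slot of accumulated sample weight per category (illustrative). */
struct mem_stat_sketch {
	uint64_t entries[CACHE_NR];
};

/* Add one sample's weight to the slot of its category. */
static void mem_stat_add(struct mem_stat_sketch *st,
			 enum cache_level lvl, uint64_t weight)
{
	st->entries[lvl] += weight;
}

/* Share of the entry's total weight that falls in one category, in percent. */
static double mem_stat_percent(const struct mem_stat_sketch *st,
			       enum cache_level lvl)
{
	uint64_t total = 0;

	for (int i = 0; i < CACHE_NR; i++)
		total += st->entries[i];

	return total ? 100.0 * st->entries[lvl] / total : 0.0;
}
```

The same array shape would work for the other kinds of information by
swapping in a different index enum, which is the sense in which such a
structure is generic.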
The first patch is a fix for the hierarchy mode and it was sent
separately. I include it here so as not to break the hierarchy mode.
The second patch enables SAMPLE_DATA_SRC without SAMPLE_ADDR and
perf_event_attr.mmap_data, both of which generate a lot more data.
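
For readers less familiar with the attribute bits involved, here is a
minimal sketch of what "data source without address" means at the
perf_event_attr level. It is not taken from the patch: the helper name
init_data_src_only_attr is hypothetical, and the exact set of
sample_type bits perf record ends up with is an assumption. Only the
constants and fields from <linux/perf_event.h> are real.

```c
#include <linux/perf_event.h>
#include <string.h>

/*
 * Fill in only the sampling-related fields (illustrative); event
 * selection (type, config, period) is deliberately left out.
 */
static void init_data_src_only_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);

	/* Ask for the data source and weight, but not PERF_SAMPLE_ADDR. */
	attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_TID |
			    PERF_SAMPLE_WEIGHT | PERF_SAMPLE_DATA_SRC;

	/* Leave mmap_data off so data mappings are not recorded. */
	attr->mmap_data = 0;
}
```

Each sample then carries the data source without the data address, and
the kernel does not emit mmap records for data mappings, which is where
the savings in recorded data would come from.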
The names of some new fields are the same as the corresponding sort
keys (mem, op, snoop), so I had to change the order in which a name is
checked as an output field or a sort key. Maybe it's better to name
them differently but I couldn't come up with better ideas.
That means you need to use the -F/--fields option to specify those
fields along with the sort keys you want. Maybe we can change the
default output and sort keys for perf mem report with this.
The code is available on the 'perf/mem-field-v1' branch in
git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
Thanks,
Namhyung
Namhyung Kim (11):
perf hist: Remove output field from sort-list properly
perf record: Add --sample-mem-info option
perf hist: Support multi-line header
perf hist: Add struct he_mem_stat
perf hist: Basic support for mem_stat accounting
perf hist: Implement output fields for mem stats
perf mem: Add 'op' output field
perf hist: Hide unused mem stat columns
perf mem: Add 'cache' and 'memory' output fields
perf mem: Add 'snoop' output field
perf mem: Add 'dtlb' output field
tools/perf/Documentation/perf-record.txt | 7 +-
tools/perf/builtin-record.c | 6 +
tools/perf/ui/browsers/hists.c | 50 ++++-
tools/perf/ui/hist.c | 272 ++++++++++++++++++++++-
tools/perf/ui/stdio/hist.c | 57 +++--
tools/perf/util/evsel.c | 2 +-
tools/perf/util/hist.c | 78 +++++++
tools/perf/util/hist.h | 22 ++
tools/perf/util/mem-events.c | 183 ++++++++++++++-
tools/perf/util/mem-events.h | 57 +++++
tools/perf/util/record.h | 1 +
tools/perf/util/sort.c | 42 +++-
12 files changed, 718 insertions(+), 59 deletions(-)
--
2.49.0.906.g1f30a19c02-goog