[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aBTsMkY1nAVpIUQ4@x1>
Date: Fri, 2 May 2025 13:00:50 -0300
From: Arnaldo Carvalho de Melo <acme@...nel.org>
To: Namhyung Kim <namhyung@...nel.org>
Cc: Joe Mario <jmario@...hat.com>, Ian Rogers <irogers@...gle.com>,
Kan Liang <kan.liang@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>,
Adrian Hunter <adrian.hunter@...el.com>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>, LKML <linux-kernel@...r.kernel.org>,
linux-perf-users@...r.kernel.org,
Ravi Bangoria <ravi.bangoria@....com>, Leo Yan <leo.yan@....com>
Subject: Re: [RFC/PATCHSET 00/11] perf mem: Add new output fields for data
source (v1)
On Wed, Apr 30, 2025 at 01:55:37PM -0700, Namhyung Kim wrote:
> Hello,
> The perf mem uses PERF_SAMPLE_DATA_SRC which has a lot of information
> for memory access. It has various sort keys to group related samples
> together but it's still cumbersome to see the result. While perf c2c
> command provides a way to investigate the data in a specific way, I'd
> like to add more generic ways using new output fields.
> For example, the following is the 'cache' output field which breaks
> down the sample weights into different level of caches.
Super cool!
> $ perf mem record -a sleep 1
>
> $ perf mem report -F cache,dso,sym --stdio
> ...
> #
> # -------------- Cache --------------
> # L1 L2 L3 L1-buf Other Shared Object Symbol
> # ................................... ..................................... .........................................
> #
> 0.0% 0.0% 0.0% 0.0% 100.0% [kernel.kallsyms] [k] ioread8
> 100.0% 0.0% 0.0% 0.0% 0.0% [kernel.kallsyms] [k] _raw_spin_lock_irq
> 0.0% 0.0% 0.0% 0.0% 100.0% [xhci_hcd] [k] xhci_update_erst_dequeue
> 0.0% 0.0% 0.0% 95.8% 4.2% [kernel.kallsyms] [k] smaps_account
> 0.6% 1.8% 22.7% 45.5% 29.5% [kernel.kallsyms] [k] sched_balance_update_blocked_averages
> 29.4% 0.0% 1.6% 58.8% 10.2% [kernel.kallsyms] [k] __update_load_avg_cfs_rq
> 0.0% 8.5% 4.3% 0.0% 87.2% [kernel.kallsyms] [k] copy_mc_enhanced_fast_string
> 63.9% 0.0% 8.0% 23.8% 4.3% [kernel.kallsyms] [k] psi_group_change
> 3.9% 0.0% 9.3% 35.7% 51.1% [kernel.kallsyms] [k] timerqueue_add
> 35.9% 10.9% 0.0% 39.0% 14.2% [kernel.kallsyms] [k] memcpy
> 94.1% 0.0% 0.0% 5.9% 0.0% [kernel.kallsyms] [k] unmap_page_range
> 25.7% 0.0% 4.9% 51.0% 18.4% [kernel.kallsyms] [k] __update_load_avg_se
> 0.0% 24.9% 19.4% 9.6% 46.1% [kernel.kallsyms] [k] _copy_to_iter
> 12.9% 0.0% 0.0% 87.1% 0.0% [kernel.kallsyms] [k] next_uptodate_folio
> 36.8% 0.0% 9.5% 16.6% 37.1% [kernel.kallsyms] [k] update_curr
> 100.0% 0.0% 0.0% 0.0% 0.0% bpf_prog_b9611ccbbb3d1833_dfs_iter [k] bpf_prog_b9611ccbbb3d1833_dfs_iter
> 45.4% 1.8% 20.4% 23.6% 8.8% [kernel.kallsyms] [k] audit_filter_rules.isra.0
> 92.8% 0.0% 0.0% 7.2% 0.0% [kernel.kallsyms] [k] filemap_map_pages
> 10.6% 0.0% 0.0% 89.4% 0.0% [kernel.kallsyms] [k] smaps_page_accumulate
> 38.3% 0.0% 29.6% 27.1% 5.0% [kernel.kallsyms] [k] __schedule
> Please see the description of each commit for other fields.
> New mem_stat field was added to the hist_entry to save this
> information. It's a generic data structure (array) to handle
> different type of information like cache-level, memory location,
> snoop-result, etc.
> The first patch is a fix for the hierarchy mode and it was sent
> separately. I just add it here not to break the hierarchy mode. The
> second patch is to enable SAMPLE_DATA_SRC without SAMPLE_ADDR and
> perf_event_attr.mmap_data which generate a lot more data.
I merged it and added a test for the hierachy mode as mentioned in my
reply to that patch.
> The name of some new fields are the same as the corresponding sort
> keys (mem, op, snoop) so I had to change the order whether it's
> applied as an output field or a sort key. Maybe it's better to name
> them differently but I couldn't come up with better ideas.
Looks ok at first sight.
> That means, you need to use -F/--fields option to specify those fields
> and the sort keys you want. Maybe we can change the default output
> and sort keys for perf mem report with this.
Maybe we can come up with aliases to help using these new features
without having to create a long command line, maybe:
perf cache
Or some other more suitable name.
That would just be translated into the long command line for 'perf
report', kinda like 'perf kvm', but maybe we can do it like with 'perf
archive', i.e. just a shell wrapper?
> The code is available at 'perf/mem-field-v1' branch in
I'll test it, and I'm CCing Joe Mario, who I think will be very much
interesting in trying this!
- Arnaldo
> git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
> Thanks,
> Namhyung
> Namhyung Kim (11):
> perf hist: Remove output field from sort-list properly
> perf record: Add --sample-mem-info option
> perf hist: Support multi-line header
> perf hist: Add struct he_mem_stat
> perf hist: Basic support for mem_stat accounting
> perf hist: Implement output fields for mem stats
> perf mem: Add 'op' output field
> perf hist: Hide unused mem stat columns
> perf mem: Add 'cache' and 'memory' output fields
> perf mem: Add 'snoop' output field
> perf mem: Add 'dtlb' output field
>
> tools/perf/Documentation/perf-record.txt | 7 +-
> tools/perf/builtin-record.c | 6 +
> tools/perf/ui/browsers/hists.c | 50 ++++-
> tools/perf/ui/hist.c | 272 ++++++++++++++++++++++-
> tools/perf/ui/stdio/hist.c | 57 +++--
> tools/perf/util/evsel.c | 2 +-
> tools/perf/util/hist.c | 78 +++++++
> tools/perf/util/hist.h | 22 ++
> tools/perf/util/mem-events.c | 183 ++++++++++++++-
> tools/perf/util/mem-events.h | 57 +++++
> tools/perf/util/record.h | 1 +
> tools/perf/util/sort.c | 42 +++-
> 12 files changed, 718 insertions(+), 59 deletions(-)
>
> --
> 2.49.0.906.g1f30a19c02-goog
Powered by blists - more mailing lists