Message-ID: <ZXC1U8y4JAUaQ6lm@kernel.org>
Date: Wed, 6 Dec 2023 14:54:27 -0300
From: Arnaldo Carvalho de Melo <acme@...nel.org>
To: Ian Rogers <irogers@...gle.com>
Cc: Ayush Jain <ayush.jain3@....com>,
Sandipan Das <sandipan.das@....com>,
linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
peterz@...radead.org, Ingo Molnar <mingo@...nel.org>,
mark.rutland@....com, alexander.shishkin@...ux.intel.com,
Jiri Olsa <jolsa@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
Adrian Hunter <adrian.hunter@...el.com>, kjain@...ux.ibm.com,
atrajeev@...ux.vnet.ibm.com, barnali@...ux.ibm.com,
ananth.narayan@....com, ravi.bangoria@....com,
santosh.shukla@....com
Subject: Re: [PATCH] perf test: Retry without grouping for all metrics test
On Wed, Dec 06, 2023 at 08:35:23AM -0800, Ian Rogers wrote:
> On Wed, Dec 6, 2023 at 5:08 AM Arnaldo Carvalho de Melo <acme@...nel.org> wrote:
> > Hmm, I'm not able to reproduce the problem here, before applying
> > this patch:
> Please don't apply the patch. The patch masks a bug in metrics/PMUs
I didn't
> and the proper fix was:
> 8d40f74ebf21 perf vendor events amd: Fix large metrics
> https://lore.kernel.org/r/20230706063440.54189-1-sandipan.das@amd.com
that is upstream:
⬢[acme@...lbox perf-tools-next]$ git log tools/perf/pmu-events/arch/x86/amdzen1/recommended.json
commit 8d40f74ebf217d3b9e9b7481721e6236b857cc55
Author: Sandipan Das <sandipan.das@....com>
Date: Thu Jul 6 12:04:40 2023 +0530
perf vendor events amd: Fix large metrics
There are cases where a metric requires more events than the number of
available counters. E.g. AMD Zen, Zen 2 and Zen 3 processors have four
data fabric counters but the "nps1_die_to_dram" metric has eight events.
By default, the constituent events are placed in a group and since the
events cannot be scheduled at the same time, the metric is not computed.
The "all metrics" test also fails because of this.
Use the NO_GROUP_EVENTS constraint for such metrics which anyway expect
the user to run perf with "--metric-no-group".
E.g.
$ sudo perf test -v 101
Before:
101: perf all metrics test :
--- start ---
test child forked, pid 37131
Testing branch_misprediction_ratio
Testing all_remote_links_outbound
Testing nps1_die_to_dram
Metric 'nps1_die_to_dram' not printed in:
Error:
Invalid event (dram_channel_data_controller_4) in per-thread mode, enable system wide with '-a'.
Testing macro_ops_dispatched
Testing all_l2_cache_accesses
Testing all_l2_cache_hits
Testing all_l2_cache_misses
Testing ic_fetch_miss_ratio
Testing l2_cache_accesses_from_l2_hwpf
Testing l2_cache_misses_from_l2_hwpf
Testing op_cache_fetch_miss_ratio
Testing l3_read_miss_latency
Testing l1_itlb_misses
test child finished with -1
---- end ----
perf all metrics test: FAILED!
After:
101: perf all metrics test :
--- start ---
test child forked, pid 43766
Testing branch_misprediction_ratio
Testing all_remote_links_outbound
Testing nps1_die_to_dram
Testing macro_ops_dispatched
Testing all_l2_cache_accesses
Testing all_l2_cache_hits
Testing all_l2_cache_misses
Testing ic_fetch_miss_ratio
Testing l2_cache_accesses_from_l2_hwpf
Testing l2_cache_misses_from_l2_hwpf
Testing op_cache_fetch_miss_ratio
Testing l3_read_miss_latency
Testing l1_itlb_misses
test child finished with 0
---- end ----
perf all metrics test: Ok
Reported-by: Ayush Jain <ayush.jain3@....com>
Suggested-by: Ian Rogers <irogers@...gle.com>
Signed-off-by: Sandipan Das <sandipan.das@....com>
Acked-by: Ian Rogers <irogers@...gle.com>
Cc: Adrian Hunter <adrian.hunter@...el.com>
Cc: Alexander Shishkin <alexander.shishkin@...ux.intel.com>
Cc: Ananth Narayan <ananth.narayan@....com>
Cc: Ingo Molnar <mingo@...hat.com>
Cc: Jiri Olsa <jolsa@...nel.org>
Cc: Mark Rutland <mark.rutland@....com>
Cc: Namhyung Kim <namhyung@...nel.org>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Ravi Bangoria <ravi.bangoria@....com>
Cc: Santosh Shukla <santosh.shukla@....com>
Link: https://lore.kernel.org/r/20230706063440.54189-1-sandipan.das@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@...hat.com>
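For readers following along, the scheduling problem described in that commit message can be modelled in a few lines of Python. This is only a sketch under a simplified assumption (a group can be counted only if all of its events fit on the available counters at the same time), not perf's actual code. The function names are hypothetical; the numbers (four data fabric counters, eight events in "nps1_die_to_dram") come from the commit message.

```python
def can_schedule_as_group(num_events: int, num_counters: int) -> bool:
    """Simplified model: every event in a group needs its own counter at once."""
    return num_events <= num_counters


def counting_passes_ungrouped(num_events: int, num_counters: int) -> int:
    """Without grouping, events may take turns on the counters.

    Returns how many rotation passes are needed so that each event gets
    a counter at least once (ceiling division).
    """
    return -(-num_events // num_counters)


# Figures from the commit message: Zen, Zen 2 and Zen 3 have four data
# fabric counters, while "nps1_die_to_dram" is built from eight events.
DF_COUNTERS = 4
NPS1_DIE_TO_DRAM_EVENTS = 8

grouped_ok = can_schedule_as_group(NPS1_DIE_TO_DRAM_EVENTS, DF_COUNTERS)
passes = counting_passes_ungrouped(NPS1_DIE_TO_DRAM_EVENTS, DF_COUNTERS)
```

In this model `grouped_ok` is False, which corresponds to the metric not being computed when its events are grouped, while the ungrouped case only needs the events to share the counters over two passes.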
> > Ian, I also stumbled on this:
> > [root@...e ~]# perf stat -M dram_channel_data_controller_4
> > Cannot find metric or group `dram_channel_data_controller_4'
> > ^C
> > Performance counter stats for 'system wide':
> > 284,908.91 msec cpu-clock # 32.002 CPUs utilized
> > 6,485,456 context-switches # 22.763 K/sec
> > 719 cpu-migrations # 2.524 /sec
> > 32,800 page-faults # 115.125 /sec
<SNIP>
> > I.e. -M should bail out at that point (Cannot find metric or group `dram_channel_data_controller_4'), no?
> We could. I suspect the code has always just not bailed out. I'll put
> together a patch adding the bail out.
Great, thanks,
- Arnaldo
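
The bail-out discussed above can be sketched as follows. This is a hypothetical Python model of the intended behaviour, not the perf implementation: `MetricNotFound`, `KNOWN_METRICS` and `resolve_metrics` are names invented for illustration. The idea is to stop before any counting starts when a name passed to -M is not a known metric or group, rather than printing the message and carrying on with the default system-wide counters.

```python
class MetricNotFound(Exception):
    """Raised when a requested name is neither a metric nor a metric group."""


# Hypothetical lookup table; the names are taken from the test output
# quoted in the commit message.
KNOWN_METRICS = {
    "branch_misprediction_ratio",
    "all_remote_links_outbound",
    "nps1_die_to_dram",
}


def resolve_metrics(requested, known=KNOWN_METRICS):
    """Return the requested metric names, or fail on the first unknown one."""
    for name in requested:
        if name not in known:
            # Bail out here instead of falling through to default events.
            raise MetricNotFound(f"Cannot find metric or group `{name}'")
    return list(requested)
```

With this shape, passing an event name such as dram_channel_data_controller_4 raises the error immediately and nothing is counted.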