Message-ID: <b5821a8375372fddd16ee6b53bf4cd218bdf9a8b.camel@intel.com>
Date: Fri, 17 Jan 2025 20:30:12 +0000
From: "Falcon, Thomas" <thomas.falcon@...el.com>
To: "Biggers, Caleb" <caleb.biggers@...el.com>, "irogers@...gle.com"
	<irogers@...gle.com>, "kan.liang@...ux.intel.com" <kan.liang@...ux.intel.com>
CC: "alexander.shishkin@...ux.intel.com" <alexander.shishkin@...ux.intel.com>,
	"mpetlan@...hat.com" <mpetlan@...hat.com>, "Taylor, Perry"
	<perry.taylor@...el.com>, "Hunter, Adrian" <adrian.hunter@...el.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-perf-users@...r.kernel.org" <linux-perf-users@...r.kernel.org>,
	"mingo@...hat.com" <mingo@...hat.com>, "manivannan.sadhasivam@...aro.org"
	<manivannan.sadhasivam@...aro.org>, "Alt, Samantha" <samantha.alt@...el.com>,
	"peterz@...radead.org" <peterz@...radead.org>, "Wang, Weilin"
	<weilin.wang@...el.com>, "Baker, Edward" <edward.baker@...el.com>,
	"acme@...nel.org" <acme@...nel.org>, "afaerber@...e.de" <afaerber@...e.de>,
	"jolsa@...nel.org" <jolsa@...nel.org>, "namhyung@...nel.org"
	<namhyung@...nel.org>, "mark.rutland@....com" <mark.rutland@....com>
Subject: Re: [PATCH v2 00/23] Intel vendor events and TMA 5.01 metrics

On Fri, 2025-01-17 at 11:43 -0800, Ian Rogers wrote:
> On Fri, Jan 17, 2025 at 11:10 AM Liang, Kan
> <kan.liang@...ux.intel.com> wrote:
> > 
> > 
> > 
> > On 2025-01-17 11:03 a.m., Liang, Kan wrote:
> > > 
> > > 
> > > On 2025-01-16 1:43 a.m., Ian Rogers wrote:
> > > > Update the Intel vendor events to the latest.
> > > > Update the metrics to TMA 5.01.
> > > > Add Arrowlake and Clearwaterforest support.
> > > > Add metrics for LNL and GNR.
> > > > Address IIO uncore issue spotted on EMR, GRR, GNR, SPR and SRF.
> > > > 
> > > > The perf json was generated using the script:
> > > > https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
> > > > with the generated json being in:
> > > > https://github.com/intel/perfmon/tree/main/scripts/perf
> > > > 
> > > > Thanks to Perry Taylor <perry.taylor@...el.com>, Caleb Biggers
> > > > <caleb.biggers@...el.com>, Edward Baker <edward.baker@...el.com> and
> > > > Weilin Wang <weilin.wang@...el.com> for helping get this patch series
> > > > together.
> > > > 
> > > > v2: Fix hybrid and Co-authored-by tag issues reported by
> > > >     Arnaldo. Updates to Lunarlake and Meteorlake events. Addition of
> > > >     Clearwaterforest.
> > > 
> > > Thanks Ian!
> > > 
> > > Acked-by: Kan Liang <kan.liang@...ux.intel.com>
> > > 
> > 
> > Thanks to Thomas for running more tests on the series.
> > 
> > There is an issue with the FP_ARITH*-related metrics on hybrid platforms.
> > I have to take the Acked-by back. Sorry for the noise.
> > 
> > Here is the issue on ADL and ARL.
> > 
> > $ sudo ./perf stat -M tma_info_inst_mix_iparith -a sleep 1
> > Cannot resolve IDs for tma_info_inst_mix_iparith: INST_RETIRED.ANY /
> > (FP_ARITH_INST_RETIRED.SCALAR + FP_ARITH_INST_RETIRED.VECTOR)
> > 
> > 
> > The patch set adds tma_info_inst_mix_iparith for cpu_atom.
> > 
> > +    {
> > +        "BriefDescription": "Instructions per FP Arithmetic instruction (lower number means higher occurrence rate)",
> > +        "MetricExpr": "INST_RETIRED.ANY / (FP_ARITH_INST_RETIRED.SCALAR + FP_ARITH_INST_RETIRED.VECTOR)",
> > +        "MetricGroup": "Flops;InsType;Inst_Metric",
> > +        "MetricName": "tma_info_inst_mix_iparith",
> > +        "MetricThreshold": "tma_info_inst_mix_iparith < 10",
> > +        "PublicDescription": "Instructions per FP Arithmetic instruction (lower number means higher occurrence rate). Values < 1 are possible due to intentional FMA double counting. Approximated prior to BDW",
> > +        "Unit": "cpu_atom"
> > +    },
> > 
> > However, the FP_ARITH_INST_RETIRED.SCALAR and
> > FP_ARITH_INST_RETIRED.VECTOR events are only available for cpu_core.
> > 
> > sudo ./perf stat -e FP_ARITH_INST_RETIRED.SCALAR -a sleep 1
> > 
> >  Performance counter stats for 'system wide':
> > 
> >                  0      cpu_core/FP_ARITH_INST_RETIRED.SCALAR/
> > 
> > There should be no such metric for cpu_atom.
> 
> Thanks Thomas and Kan!
> 
> The metric came from here:
> https://github.com/intel/perfmon/blob/main/ADL/metrics/perf/alderlake_metrics_goldencove_core_perf.json#L1243
> and the event being in the cpu_core will explain why it passed the
> sanity check that all events are in the event json.
> 
> I believe Caleb can address the issue. I think all the events need PMU
> prefixes as happened previously here:
> https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py#L1535
> 
> A few more kinks to resolve in the new TMA release process, thanks for
> the testing!
> Ian


Thanks, I'm also seeing some weirdness in the perf all metrics and
metricgroups tests with verbose output:

.../perf-tools-next/tools/perf# ./perf test 95
 95: perf all metrics test                                           : Skip
.../perf-tools-next/tools/perf# ./perf test 95 -vvv
 95: perf all metrics test:
--- start ---
test child forked, pid 135502
Testing tma_core_bound

...

Testing tma_fp_scalar
FP issues
Cannot resolve IDs for tma_fp_scalar: FP_ARITH_INST_RETIRED.SCALAR /
(tma_retiring arch bench Build builtin-annotate.c builtin-annotate.o
builtin-bench.c builtin-bench.o builtin-buildid-cache.c builtin-
buildid-cache.o builtin-buildid-list.c builtin-buildid-list.o builtin-
c2c.c builtin-c2c.o builtin-check.c builtin-check.o builtin-config.c
builtin-config.o builtin-daemon.c builtin-daemon.o builtin-data.c
builtin-data.o builtin-diff.c builtin-diff.o builtin-evlist.c builtin-
evlist.o builtin-ftrace.c builtin-ftrace.o builtin.h builtin-help.c
builtin-help.o builtin-inject.c builtin-inject.o builtin-kallsyms.c
builtin-kallsyms.o builtin-kmem.c builtin-kmem.o builtin-kvm.c builtin-
kvm.o builtin-kwork.c builtin-kwork.o builtin-list.c builtin-list.o
builtin-lock.c builtin-lock.o builtin-mem.c builtin-mem.o builtin-
probe.c builtin-probe.o builtin-record.c builtin-record.o builtin-
report.c builtin-report.o builtin-sched.c builtin-sched.o builtin-
script.c builtin-script.o builtin-stat.c builtin-stat.o builtin-
timechart.c builtin-timechart.o builtin-top.c builtin-top.o builtin-
trace.c builtin-trace.o builtin-version.c builtin-version.o check-
header_ignore_hunks check-headers.sh check-headers.sh.shellcheck_log
command-list.txt common-cmds.h CREDITS design.txt dlfilters
Documentation FEATURE-DUMP include jvmti libapi libbpf libperf libperf-
bench.a libperf-jvmti.so libperf-test.a libperf-ui.a libperf-util.a
libpmu-events.a libsubcmd libsymbol Makefile Makefile.config
Makefile.perf MANIFEST perf perf-archive perf-archive.sh perf-
archive.sh.shellcheck_log perf-bench-in.o perf.c perf-completion.sh
perf-completion.sh.shellcheck_log perf.h perf-in.o perf-iostat perf-
iostat.sh perf-iostat.sh.shellcheck_log perf.o perf-read-vdso32 perf-
read-vdso.c perf-sys.h perf-test-in.o perf-ui-in.o perf-util-in.o PERF-
VERSION-FILE pmu-events python python_ext_build scripts tests trace ui
util tma_info_thread_slots)

...

---- end(-2) ----
 95: perf all metrics test                                           : Skip

The * is getting replaced with the files in my working directory, which
is tools/perf, but only when running 'perf test'. Running 'perf stat'
directly just prints the error normally:

.../perf-tools-next/tools/perf# ./perf stat -M tma_fp_scalar sleep 1
Cannot resolve IDs for tma_fp_scalar: FP_ARITH_INST_RETIRED.SCALAR /
(tma_retiring * tma_info_thread_slots)
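
This behaviour is consistent with ordinary shell pathname expansion. If a
script echoes captured output through an unquoted variable, the shell
word-splits the text and expands the bare * against the current directory.
Whether the perf test script does this is an assumption that has not been
checked here. The sketch below is a hypothetical standalone reproduction;
the temporary directory, the file names and the variable names are made up
for illustration and are not taken from the perf test scripts.

```shell
#!/bin/sh
# Hypothetical reproduction of the '*' expansion; not the perf test itself.

# Work in a scratch directory holding two made-up files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch builtin-stat.c Makefile

# Stand-in for the error text captured from 'perf stat'.
result='Cannot resolve IDs for tma_fp_scalar: FP_ARITH_INST_RETIRED.SCALAR / (tma_retiring * tma_info_thread_slots)'

# Unquoted: $result is word-split and the lone '*' is glob-expanded
# to the names of the files in the current directory.
unquoted=$(echo $result)

# Quoted: the text is passed through unchanged.
quoted=$(echo "$result")

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

If that is what is happening, quoting the variable where the test prints
the captured output should be enough to keep the expression intact.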

Tom
