linux-kernel - Re: [RFC PATCH v8 6/7] perf vendor events intel: Add MTL metric json files

Message-ID: <CAP-5=fXDmewhEzZc5ZYhfHYtPUmELjeDTKM5m04PRFaAPaO+vg@mail.gmail.com>
Date: Thu, 16 May 2024 09:57:21 -0700
From: Ian Rogers <irogers@...gle.com>
To: weilin.wang@...el.com
Cc: Namhyung Kim <namhyung@...nel.org>, Arnaldo Carvalho de Melo <acme@...nel.org>, 
	Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, 
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, 
	Adrian Hunter <adrian.hunter@...el.com>, Kan Liang <kan.liang@...ux.intel.com>, 
	linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Perry Taylor <perry.taylor@...el.com>, Samantha Alt <samantha.alt@...el.com>, 
	Caleb Biggers <caleb.biggers@...el.com>
Subject: Re: [RFC PATCH v8 6/7] perf vendor events intel: Add MTL metric json files

On Tue, May 14, 2024 at 10:44 PM <weilin.wang@...el.com> wrote:
>
> From: Weilin Wang <weilin.wang@...el.com>
>
> Add MTL metric json file at TMA4.7 [1]. Some of the metrics' formulas use TPEBS
> retire_latency in MTL.
>
> [1] https://lore.kernel.org/all/20240214011820.644458-1-irogers@google.com/
>
> Signed-off-by: Weilin Wang <weilin.wang@...el.com>
> Reviewed-by: Ian Rogers <irogers@...gle.com>

This change works with either the approach in this series or the
evsel approach, so I don't mind my Reviewed-by standing. I'd prefer
that we had an evsel read counter implementation that returns 0, so
that we can run without retirement latency gathering.

TMA 4.7 is broken in that the tma_lock_latency metric uses a
retirement latency event directly, not within a max function, so
having the read counter return 0 would zero out the product and break
the metric:

+    {
+        "BriefDescription": "This metric represents fraction of cycles the CPU spent handling cache misses due to lock operations",
+        "MetricExpr": "MEM_INST_RETIRED.LOCK_LOADS * MEM_INST_RETIRED.LOCK_LOADS:R / tma_info_thread_clks",
+        "MetricGroup": "Offcore;TopdownL4;tma_L4_group;tma_issueRFO;tma_l1_bound_group",
+        "MetricName": "tma_lock_latency",
+        "MetricThreshold": "tma_lock_latency > 0.2 & (tma_l1_bound > 0.1 & (tma_memory_bound > 0.2 & tma_backend_bound > 0.2))",
+        "PublicDescription": "This metric represents fraction of cycles the CPU spent handling cache misses due to lock operations. Due to the microarchitecture handling of locks; they are classified as L1_Bound regardless of what memory source satisfied them. Sample with: MEM_INST_RETIRED.LOCK_LOADS_PS. Related metrics: tma_store_latency",
+        "ScaleUnit": "100%",
+        "Unit": "cpu_core"
+    },
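
To illustrate the concern with the MetricExpr above, here is a minimal
Python sketch of the arithmetic. It is not perf's expression evaluator.
The function names, the sample counts, and the floor value are all
hypothetical, chosen only to show how a 0 retirement latency propagates
through a bare product and how wrapping the latency in max() would
avoid that.

```python
# Sketch only: models the arithmetic of the tma_lock_latency MetricExpr.
# All names and numbers below are hypothetical, not taken from perf or TMA.

def lock_latency_unguarded(lock_loads, retire_latency, thread_clks):
    """LOCK_LOADS * LOCK_LOADS:R / tma_info_thread_clks, as in the patch."""
    return lock_loads * retire_latency / thread_clks


def lock_latency_guarded(lock_loads, retire_latency, thread_clks, floor):
    """Same expression, but with the latency wrapped in max().

    'floor' is an assumed minimum latency; the real value, if any,
    would have to come from the TMA metric definitions.
    """
    return lock_loads * max(retire_latency, floor) / thread_clks


lock_loads = 1_000         # hypothetical event count
thread_clks = 1_000_000    # hypothetical cycle count
assumed_floor = 20.0       # hypothetical minimum latency in cycles

# A read counter that returns 0 when retirement latency is not gathered:
unguarded = lock_latency_unguarded(lock_loads, 0.0, thread_clks)
guarded = lock_latency_guarded(lock_loads, 0.0, thread_clks, assumed_floor)

print(f"unguarded: {unguarded}")  # product collapses to zero
print(f"guarded:   {guarded}")    # falls back to the assumed floor
```

Any metric that takes tma_lock_latency as an input would inherit the
zero from the unguarded form.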

Other metrics then use that metric, specifically
tma_info_bottleneck_memory_data_tlbs and
tma_info_bottleneck_cache_memory_bandwidth.

I couldn't find updated MTL metrics in the TMA 4.8 release:
https://github.com/intel/perfmon/pull/181/commits/d54c847b2f863c98a917bdd31a0680f4d50ff75c
but my belief is that this issue hasn't been addressed.

Thanks,
Ian