Message-ID: <07ee39aa-309f-4414-aee0-cc5b86a66af7@linux.intel.com>
Date: Wed, 6 Nov 2024 11:46:56 -0500
From: "Liang, Kan" <kan.liang@...ux.intel.com>
To: Ian Rogers <irogers@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
 Arnaldo Carvalho de Melo <acme@...nel.org>,
 Namhyung Kim <namhyung@...nel.org>, Mark Rutland <mark.rutland@....com>,
 Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
 Jiri Olsa <jolsa@...nel.org>, Adrian Hunter <adrian.hunter@...el.com>,
 linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
 Perry Taylor <perry.taylor@...el.com>, Samantha Alt
 <samantha.alt@...el.com>, Caleb Biggers <caleb.biggers@...el.com>,
 Weilin Wang <weilin.wang@...el.com>, Edward Baker <edward.baker@...el.com>
Subject: Re: [PATCH v4 00/22] Python generated Intel metrics



On 2024-10-09 12:02 p.m., Ian Rogers wrote:
> On Fri, Sep 27, 2024 at 11:34 AM Liang, Kan <kan.liang@...ux.intel.com> wrote:
>>
>>
>>
>> On 2024-09-26 1:50 p.m., Ian Rogers wrote:
>>> Generate twenty sets of additional metrics for Intel. Rapl and Idle
>>> metrics aren't specific to Intel but are placed here for ease and
>>> convenience. Smi and tsx metrics are added so they can be dropped from
>>> the per model json files.
>>
>> Are Smi and tsx metrics the only two metrics whose duplicates in
>> the json files will be dropped?
> 
> Yes. These metrics with their runtime detection and use of sysfs event
> names I feel more naturally fit here rather than in the Intel perfmon
> github converter script.
> 
>> It sounds like there will be many duplicate metrics in perf list, right?
> 
> That's not the goal. There may be memory bandwidth computed in
> different ways, like TMA and using uncore, but that seems okay as the
> metrics are using different counters so may say different things. I
> think there is an action to always watch the metrics and ensure
> duplicates don't occur, but some duplication can be beneficial.


Can we give a common prefix for all the automatically generated metrics,
e.g., general_ or std_?
As you said, there may be different metrics to calculate the same thing.

With a common prefix, we can clearly tell where a metric comes from. If
issues are found later with some metrics, I can tell the end user to use
either the TMA metrics or the automatically generated metrics.
If they count the same thing, the main body of the metric name should be
the same.
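
To illustrate the naming idea, here is a minimal sketch, not code from the
patch series. It assumes "std_" as the prefix for generated metrics and
"tma_" for TMA metrics; the helper names and the metric names in the usage
note are hypothetical. It strips the source prefix and groups metrics that
share the same main body, so overlapping metrics are easy to spot.

```python
from collections import defaultdict

# Assumption: "std_" marks automatically generated metrics and "tma_"
# marks TMA metrics. Neither prefix is defined by the patch series.
SOURCE_PREFIXES = ("std_", "tma_")


def metric_body(name: str) -> str:
    """Return the metric name with any known source prefix removed."""
    for prefix in SOURCE_PREFIXES:
        if name.startswith(prefix):
            return name[len(prefix):]
    return name


def group_by_body(names):
    """Map each shared body to the metrics that use it (two or more only)."""
    groups = defaultdict(list)
    for name in names:
        groups[metric_body(name)].append(name)
    return {body: sorted(group) for body, group in groups.items()
            if len(group) > 1}
```

For example, group_by_body(["tma_mem_bw", "std_mem_bw", "std_idle"]) would
report that "mem_bw" is provided by both sources, while "std_idle" has no
counterpart and is left out.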

Thanks,
Kan

> 
>> Also, is it an attempt to define some architectural metrics for perf?
> 
> There are many advantages of using python to generate the metric json,
> a few are:
> 1) we verify the metrics use events from the event json,
> 2) the error prone escaping of commas and slashes is handled by the python,
> 3) metric expressions can be spread over multiple lines and have comments.
> It is also an advantage that we can avoid copy-pasting one metric from
> one architectural metric json to another. This helps propagate fixes.
> 
> So, it's not so much a goal to have architectural metrics but it's nice
> that we avoid copy-paste. One place where I've tried to set up common
> events across all architectures is in giving tool its own PMU.
> Rather than have the tool PMU describe events using custom code it
> just reuses the existing PMU json support:
> https://github.com/googleprodkernel/linux-perf/blob/google_tools_master/tools/perf/pmu-events/arch/common/common/tool.json
> 
>> How do you decide which metrics should be added here?
> 
> The goal is to try to make open source metrics that Google has
> internally. I've set up a git repo for this here:
> https://github.com/googleprodkernel/linux-perf
> Often the source of the metric is Intel's documentation on things like
> uncore events; it's just that such metrics aren't part of the perfmon
> process and so we're adding them here. Were all these metrics on the
> Intel github it'd be reasonable to remove them from here. If Intel
> would like to work on or contribute some metrics here, that's also
> fine. I think the main thing is to be giving users useful metrics.
> 
> Thanks,
> Ian
> 
>>> There are four uncore sets of metrics and
>>> eleven core metrics. Add a CheckPmu function to metric to simplify
>>> detecting the presence of hybrid PMUs in events. Metrics with
>>> experimental events are flagged as experimental in their description.
>>>
>>> The patches should be applied on top of:
>>> https://lore.kernel.org/lkml/20240926174101.406874-1-irogers@google.com/
>>>
>>> v4. Experimental metric descriptions. Add mesh bandwidth metric. Rebase.
>>> v3. Swap tsx and CheckPMU patches that were in the wrong order. Some
>>>     minor code cleanup changes. Drop reference to merged fix for
>>>     umasks/occ_sel in PCU events and for cstate metrics.
>>> v2. Drop the cycles breakdown in favor of having it as a common
>>>     metric, spelling and other improvements suggested by Kan Liang
>>>     <kan.liang@...ux.intel.com>.
>>>
>>> Ian Rogers (22):
>>>   perf jevents: Add RAPL metrics for all Intel models
>>>   perf jevents: Add idle metric for Intel models
>>>   perf jevents: Add smi metric group for Intel models
>>>   perf jevents: Add CheckPmu to see if a PMU is in loaded json events
>>>   perf jevents: Mark metrics with experimental events as experimental
>>>   perf jevents: Add tsx metric group for Intel models
>>>   perf jevents: Add br metric group for branch statistics on Intel
>>>   perf jevents: Add software prefetch (swpf) metric group for Intel
>>>   perf jevents: Add ports metric group giving utilization on Intel
>>>   perf jevents: Add L2 metrics for Intel
>>>   perf jevents: Add load store breakdown metrics ldst for Intel
>>>   perf jevents: Add ILP metrics for Intel
>>>   perf jevents: Add context switch metrics for Intel
>>>   perf jevents: Add FPU metrics for Intel
>>>   perf jevents: Add Miss Level Parallelism (MLP) metric for Intel
>>>   perf jevents: Add mem_bw metric for Intel
>>>   perf jevents: Add local/remote "mem" breakdown metrics for Intel
>>>   perf jevents: Add dir breakdown metrics for Intel
>>>   perf jevents: Add C-State metrics from the PCU PMU for Intel
>>>   perf jevents: Add local/remote miss latency metrics for Intel
>>>   perf jevents: Add upi_bw metric for Intel
>>>   perf jevents: Add mesh bandwidth saturation metric for Intel
>>>
>>>  tools/perf/pmu-events/intel_metrics.py | 1046 +++++++++++++++++++++++-
>>>  tools/perf/pmu-events/metric.py        |   52 ++
>>>  2 files changed, 1095 insertions(+), 3 deletions(-)
>>>

