lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Fri, 1 Mar 2024 08:37:53 -0800
From: Ian Rogers <irogers@...gle.com>
To: "Liang, Kan" <kan.liang@...ux.intel.com>
Cc: Perry Taylor <perry.taylor@...el.com>, Samantha Alt <samantha.alt@...el.com>, 
	Caleb Biggers <caleb.biggers@...el.com>, Weilin Wang <weilin.wang@...el.com>, 
	Edward Baker <edward.baker@...el.com>, Andi Kleen <ak@...ux.intel.com>, 
	Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, 
	Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>, 
	Mark Rutland <mark.rutland@....com>, 
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, 
	Adrian Hunter <adrian.hunter@...el.com>, John Garry <john.g.garry@...cle.com>, 
	Jing Zhang <renyu.zj@...ux.alibaba.com>, Thomas Richter <tmricht@...ux.ibm.com>, 
	James Clark <james.clark@....com>, linux-kernel@...r.kernel.org, 
	linux-perf-users@...r.kernel.org, Stephane Eranian <eranian@...gle.com>
Subject: Re: [PATCH v1 04/20] perf jevents: Add tsx metric group for Intel models

On Fri, Mar 1, 2024 at 6:52 AM Liang, Kan <kan.liang@...ux.intel.com> wrote:
>
>
>
> On 2024-02-29 8:01 p.m., Ian Rogers wrote:
> > On Thu, Feb 29, 2024 at 1:15 PM Liang, Kan <kan.liang@...ux.intel.com> wrote:
> >>
> >>
> >>
> >> On 2024-02-28 7:17 p.m., Ian Rogers wrote:
> >>> Allow duplicated metric to be dropped from json files.
> >>>
> >>> Signed-off-by: Ian Rogers <irogers@...gle.com>
> >>> ---
> >>>  tools/perf/pmu-events/intel_metrics.py | 51 ++++++++++++++++++++++++++
> >>>  1 file changed, 51 insertions(+)
> >>>
> >>> diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
> >>> index 20c25d142f24..1096accea2aa 100755
> >>> --- a/tools/perf/pmu-events/intel_metrics.py
> >>> +++ b/tools/perf/pmu-events/intel_metrics.py
> >>> @@ -7,6 +7,7 @@ import argparse
> >>>  import json
> >>>  import math
> >>>  import os
> >>> +from typing import Optional
> >>>
> >>>  parser = argparse.ArgumentParser(description="Intel perf json generator")
> >>>  parser.add_argument("-metricgroups", help="Generate metricgroups data", action='store_true')
> >>> @@ -77,10 +78,60 @@ def Smi() -> MetricGroup:
> >>>      ])
> >>>
> >>>
> >>> +def Tsx() -> Optional[MetricGroup]:
> >>> +    if args.model not in [
> >>> +        'alderlake',
> >>> +        'cascadelakex',
> >>> +        'icelake',
> >>> +        'icelakex',
> >>> +        'rocketlake',
> >>> +        'sapphirerapids',
> >>> +        'skylake',
> >>> +        'skylakex',
> >>> +        'tigerlake',> +    ]:
> >>
> >> Can we get ride of the model list? Otherwise, we have to keep updating
> >> the list.
> >
> > Do we expect the list to update? :-)
>
> Yes, at least for the meteorlake and graniterapids. They should be the
> same as alderlake and sapphirerapids. I'm not sure about the future
> platforms.
>
> Maybe we can have a if args.model in list here to include all the
> non-hybrid models which doesn't support TSX. I think the list should not
> be changed shortly.
>
> > The issue is the events are in
> > sysfs and not the json. If we added the tsx events to json then this
> > list wouldn't be necessary, but it also would mean the events would be
> > present in "perf list" even when TSX is disabled.
>
> I think there may an alternative way, to check the RTM events, e.g.,
> RTM_RETIRED.START event. We only need to generate the metrics for the
> platform which supports the RTM_RETIRED.START event.
>
>
> >
> >>> +        return None
> >>> +> +    pmu = "cpu_core" if args.model == "alderlake" else "cpu"
> >>
> >> Is it possible to change the check to the existence of the "cpu" PMU
> >> here? has_pmu("cpu") ? "cpu" : "cpu_core"
> >
> > The "Unit" on "cpu" events in json always just blank. On hybrid it is
> > either "cpu_core" or "cpu_atom", so I can make this something like:
> >
> > pmu = "cpu_core" if metrics.HasPmu("cpu_core") else "cpu"
> >
> > which would be a build time test.
>
> Yes, I think using the "Unit" is good enough.
>
> >
> >
> >>> +    cycles = Event('cycles')
> >>> +    cycles_in_tx = Event(f'{pmu}/cycles\-t/')
> >>> +    transaction_start = Event(f'{pmu}/tx\-start/')
> >>> +    cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
> >>> +    metrics = [
> >>> +        Metric('tsx_transactional_cycles',
> >>> +                      'Percentage of cycles within a transaction region.',
> >>> +                      Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
> >>> +                      '100%'),
> >>> +        Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted transactions.',
> >>> +                      Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
> >>> +                                    has_event(cycles_in_tx),
> >>> +                                    0),
> >>> +                      '100%'),
> >>> +        Metric('tsx_cycles_per_transaction',
> >>> +                      'Number of cycles within a transaction divided by the number of transactions.',
> >>> +                      Select(cycles_in_tx / transaction_start,
> >>> +                                    has_event(cycles_in_tx),
> >>> +                                    0),
> >>> +                      "cycles / transaction"),
> >>> +    ]
> >>> +    if args.model != 'sapphirerapids':
> >>
> >> Add the "tsx_cycles_per_elision" metric only if
> >> has_event(f'{pmu}/el\-start/')?
> >
> > It's a sysfs event, so this wouldn't work :-(
>
> The below is the definition of el-start in the kernel.
> EVENT_ATTR_STR(el-start,        el_start,       "event=0xc8,umask=0x1");
>
> The corresponding event in the event list should be HLE_RETIRED.START
>       "EventCode": "0xC8",
>       "UMask": "0x01",
>       "EventName": "HLE_RETIRED.START",
>
> I think we may check the HLE_RETIRED.START instead. If the
> HLE_RETIRED.START doesn't exist, I don't see a reason why the
> tsx_cycles_per_elision should be supported.
>
> Again, in the virtualization world, it's possible that the
> HLE_RETIRED.START exists in the event list but el_start isn't available
> in the sysfs. I think it has to be specially handle in the test as well.

So we keep the has_event test on the sysfs event to handle the
virtualization and disabled case. We use  HLE_RETIRED.START to detect
whether the model supports TSX. Should the event be the sysfs or json
version? i.e.

        "MetricExpr": "(cycles\\-t / el\\-start if
has_event(el\\-start) else 0)",

or

        "MetricExpr": "(cycles\\-t / HLE_RETIRED.START if
has_event(el\\-start) else 0)",

I think I favor the former for some consistency with the has_event.

Using HLE_RETIRED.START means the set of TSX models goes from:
        'alderlake',
        'cascadelakex',
        'icelake',
        'icelakex',
        'rocketlake',
        'sapphirerapids',
        'skylake',
        'skylakex',
        'tigerlake',

To:
   broadwell
   broadwellde
   broadwellx
   cascadelakex
   haswell
   haswellx
   icelake
   rocketlake
   skylake
   skylakex

Using RTM_RETIRED.START it goes to:
   broadwell
   broadwellde
   broadwellx
   cascadelakex
   emeraldrapids
   graniterapids
   haswell
   haswellx
   icelake
   icelakex
   rocketlake
   sapphirerapids
   skylake
   skylakex
   tigerlake

So I'm not sure it is working equivalently to what we have today,
which may be good or bad. Here is what I think the code should look
like:

def Tsx() -> Optional[MetricGroup]:
  pmu = "cpu_core" if CheckPmu("cpu_core") else "cpu"
  cycles = Event('cycles')
  cycles_in_tx = Event(f'{pmu}/cycles\-t/')
  cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
  try:
    # Test if the tsx event is present in the json, prefer the
    # sysfs version so that we can detect its presence at runtime.
    transaction_start = Event("RTM_RETIRED.START")
    transaction_start = Event(f'{pmu}/tx\-start/')
  except:
    return None

  elision_start = None
  try:
    # Elision start isn't supported by all models, but we'll not
    # generate the tsx_cycles_per_elision metric in that
    # case. Again, prefer the sysfs encoding of the event.
    elision_start = Event("HLE_RETIRED.START")
    elision_start = Event(f'{pmu}/el\-start/')
  except:
    pass

  return MetricGroup('transaction', [
      Metric('tsx_transactional_cycles',
             'Percentage of cycles within a transaction region.',
             Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
             '100%'),
      Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted
transactions.',
             Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
                    has_event(cycles_in_tx),
                    0),
             '100%'),
      Metric('tsx_cycles_per_transaction',
             'Number of cycles within a transaction divided by the
number of transactions.',
             Select(cycles_in_tx / transaction_start,
                    has_event(cycles_in_tx),
                    0),
             "cycles / transaction"),
      Metric('tsx_cycles_per_elision',
             'Number of cycles within a transaction divided by the
number of elisions.',
             Select(cycles_in_tx / elision_start,
                    has_event(elision_start),
                    0),
             "cycles / elision") if elision_start else None,
  ], description="Breakdown of transactional memory statistics")

Wdyt?

Thanks,
Ian

> Thanks,
> Kan
>
> >
> > Thanks,
> > Ian
> >
> >> Thanks,
> >> Kan
> >>
> >>> +        elision_start = Event(f'{pmu}/el\-start/')
> >>> +        metrics += [
> >>> +            Metric('tsx_cycles_per_elision',
> >>> +                          'Number of cycles within a transaction divided by the number of elisions.',
> >>> +                          Select(cycles_in_tx / elision_start,
> >>> +                                        has_event(elision_start),
> >>> +                                        0),
> >>> +                          "cycles / elision"),
> >>> +        ]
> >>> +    return MetricGroup('transaction', metrics)
> >>> +
> >>> +
> >>>  all_metrics = MetricGroup("", [
> >>>      Idle(),
> >>>      Rapl(),
> >>>      Smi(),
> >>> +    Tsx(),
> >>>  ])
> >>>
> >>>  if args.metricgroups:
> >

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ