lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAP-5=fVYMK6tnKH0QU_RPUaogpsDmhmXn+=4P1uXg-moX2QMDw@mail.gmail.com>
Date: Fri, 10 Jan 2025 14:15:18 -0800
From: Ian Rogers <irogers@...gle.com>
To: Namhyung Kim <namhyung@...nel.org>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, 
	Arnaldo Carvalho de Melo <acme@...nel.org>, Mark Rutland <mark.rutland@....com>, 
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, 
	Adrian Hunter <adrian.hunter@...el.com>, Kan Liang <kan.liang@...ux.intel.com>, 
	James Clark <james.clark@...aro.org>, Ze Gao <zegao2021@...il.com>, 
	Weilin Wang <weilin.wang@...el.com>, Dominique Martinet <asmadeus@...ewreck.org>, 
	Jean-Philippe Romain <jean-philippe.romain@...s.st.com>, Junhao He <hejunhao3@...wei.com>, 
	linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org, 
	bpf@...r.kernel.org, Aditya Bodkhe <Aditya.Bodkhe1@....com>, 
	Atish Patra <atishp@...osinc.com>, Leo Yan <leo.yan@....com>, 
	Beeman Strong <beeman@...osinc.com>, Arnaldo Carvalho de Melo <acme@...hat.com>
Subject: Re: [PATCH v5 4/4] perf parse-events: Reapply "Prefer sysfs/JSON
 hardware events over legacy"

On Fri, Jan 10, 2025 at 11:40 AM Namhyung Kim <namhyung@...nel.org> wrote:
>
> On Thu, Jan 09, 2025 at 02:21:09PM -0800, Ian Rogers wrote:
> > Originally posted and merged from:
> > https://lore.kernel.org/r/20240416061533.921723-10-irogers@google.com
> > This reverts commit 4f1b067359ac8364cdb7f9fda41085fa85789d0f although
> > the patch is now smaller due to related fixes being applied in commit
> > 22a4db3c3603 ("perf evsel: Add alternate_hw_config and use in
> > evsel__match").
> > The original commit message was:
> >
> > It was requested that RISC-V be able to add events to the perf tool so
> > the PMU driver didn't need to map legacy events to config encodings:
> > https://lore.kernel.org/lkml/20240217005738.3744121-1-atishp@rivosinc.com/
> >
> > This change makes the priority of events specified without a PMU the
> > same as those specified with a PMU, namely sysfs and JSON events are
> > checked first before using the legacy encoding.
>
> I'm still not convinced why we need this change despite of these
> troubles.  If it's because RISC-V cannot define the lagacy hardware
> events in the kernel driver, why not using a different name in JSON and
> ask users to use the name specifically?  Something like:
>
>   $ perf record -e riscv-cycles ...

So ARM and RISC-V are more than able to speak for themselves and have
their tags on the series, but let's recap why I'm motivated to do this
change:

1) perf supported legacy events;
2) perf supported sysfs and json events, but at a lower priority than
legacy events;
3) hybrid support was added but in a way where all the hybrid PMUs
needed to be known, assumptions about PMU were implicit and baked into
the tool;
4) metric support for hybrid was going in a similar implicit direction
and I objected, what would cycles mean in a metric if the core PMU was
implicit? Rather than pursue this the hybrid code was overhauled, PMUs
became more of a thing and we added a notion of a "core" PMU which
would support legacy events;
5) ARM core PMUs differ in naming, etc. than just about every other
platform. Their core events had been being programmed as if they were
uncore events - ie without the legacy priority. Fixing hybrid, and
fixing ARM PMUs to know they supported legacy events, broke perf on
Apple-M? series due to a PMU driver issue with legacy events:
https://lore.kernel.org/lkml/08f1f185-e259-4014-9ca4-6411d5c1bc65@marcan.st/
"Perf broke on all Apple ARM64 systems (tested almost everything), and
according to maz also on Juno (so, probably all big.LITTLE) since
v6.5."
6) sysfs/json events were made the priority over legacy to unbreak
perf on Apple-M? CPUs, but only if the PMU is specified:
https://lore.kernel.org/r/20231123042922.834425-1-irogers@google.com
   Reported-by: Hector Martin <marcan@...can.st>
   Signed-off-by: Ian Rogers <irogers@...gle.com>
   Tested-by: Hector Martin <marcan@...can.st>
   Tested-by: Marc Zyngier <maz@...nel.org>
   Acked-by: Mark Rutland <mark.rutland@....com>

This gets us to the current code where I can trivially get an
inconsistency. Here on Intel with no PMU in the event name:
```
$ perf stat -vv -e cpu-cycles true
Using CPUID GenuineIntel-6-8D-1
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  size                             136
  config                           0 (PERF_COUNT_HW_CPU_CYCLES)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 752915  cpu -1  group_fd -1  flags 0x8 = 3
cpu-cycles: -1: 1293076 273429 273429
cpu-cycles: 1293076 273429 273429

 Performance counter stats for 'true':

         1,293,076      cpu-cycles

       0.000809752 seconds time elapsed

       0.000841000 seconds user
       0.000000000 seconds sys
```

Here with a PMU event name:
```
$ sudo perf stat -vv -e cpu/cpu-cycles/ true
Using CPUID GenuineIntel-6-8D-1
Attempt to add: cpu/cpu-cycles=0/
..after resolving event: cpu/event=0x3c/
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
  type                             4 (cpu)
  size                             136
  config                           0x3c (cpu-cycles)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 752839  cpu -1  group_fd -1  flags 0x8 = 3
cpu/cpu-cycles/: -1: 1421235 531150 531150
cpu/cpu-cycles/: 1421235 531150 531150

 Performance counter stats for 'true':

         1,421,235      cpu/cpu-cycles/

       0.001292908 seconds time elapsed

       0.001340000 seconds user
       0.000000000 seconds sys
```

That is the no PMU event is opened as type=0/config=0 (legacy) while
the PMU event is opened as type=4/config=0x3c (sysfs encoding). Now
let's cross our fingers and hope that in the driver they are really
the same thing. I take objection to the idea that there should be two
different priorities for sysfs/json and legacy depending on whether a
PMU is or isn't specified in the event name. The priority could be
legacy then sysfs/json, or it could be sysfs/json then legacy, but it
should be the same regardless of whether the PMU is put in the event
name. The PMU in the event name should be optional, for example we may
or may not show it in the stat output. The encoding being consistent
was the case prior to the Apple-M? fix and this patch aims to make it
consistent once more. Given the ARM bug mentioned above it should also
fix specifying or not the PMU on Apple-M? CPUs as it will avoid the
same legacy event issue that may only exist on old kernels. RISC-V is
motivated because of not wanting hard coded legacy events in the
driver for all potential vendors and models.

Thanks,
Ian

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ