lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAP-5=fXxMmn31iep6tdvaUGzZccR+_D1L4RbjaNiRdEau2NZ9g@mail.gmail.com>
Date: Mon, 13 Jan 2025 14:51:11 -0800
From: Ian Rogers <irogers@...gle.com>
To: Namhyung Kim <namhyung@...nel.org>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, 
	Arnaldo Carvalho de Melo <acme@...nel.org>, Mark Rutland <mark.rutland@....com>, 
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, 
	Adrian Hunter <adrian.hunter@...el.com>, Kan Liang <kan.liang@...ux.intel.com>, 
	James Clark <james.clark@...aro.org>, Ze Gao <zegao2021@...il.com>, 
	Weilin Wang <weilin.wang@...el.com>, Dominique Martinet <asmadeus@...ewreck.org>, 
	Jean-Philippe Romain <jean-philippe.romain@...s.st.com>, Junhao He <hejunhao3@...wei.com>, 
	linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org, 
	bpf@...r.kernel.org, Aditya Bodkhe <Aditya.Bodkhe1@....com>, 
	Atish Patra <atishp@...osinc.com>, Leo Yan <leo.yan@....com>, 
	Beeman Strong <beeman@...osinc.com>, Arnaldo Carvalho de Melo <acme@...hat.com>
Subject: Re: [PATCH v5 4/4] perf parse-events: Reapply "Prefer sysfs/JSON
 hardware events over legacy"

On Mon, Jan 13, 2025 at 2:01 PM Namhyung Kim <namhyung@...nel.org> wrote:
>
> On Fri, Jan 10, 2025 at 02:15:18PM -0800, Ian Rogers wrote:
> > On Fri, Jan 10, 2025 at 11:40 AM Namhyung Kim <namhyung@...nel.org> wrote:
> > >
> > > On Thu, Jan 09, 2025 at 02:21:09PM -0800, Ian Rogers wrote:
> > > > Originally posted and merged from:
> > > > https://lore.kernel.org/r/20240416061533.921723-10-irogers@google.com
> > > > This reverts commit 4f1b067359ac8364cdb7f9fda41085fa85789d0f although
> > > > the patch is now smaller due to related fixes being applied in commit
> > > > 22a4db3c3603 ("perf evsel: Add alternate_hw_config and use in
> > > > evsel__match").
> > > > The original commit message was:
> > > >
> > > > It was requested that RISC-V be able to add events to the perf tool so
> > > > the PMU driver didn't need to map legacy events to config encodings:
> > > > https://lore.kernel.org/lkml/20240217005738.3744121-1-atishp@rivosinc.com/
> > > >
> > > > This change makes the priority of events specified without a PMU the
> > > > same as those specified with a PMU, namely sysfs and JSON events are
> > > > checked first before using the legacy encoding.
> > >
> > > I'm still not convinced why we need this change despite of these
> > > troubles.  If it's because RISC-V cannot define the lagacy hardware
> > > events in the kernel driver, why not using a different name in JSON and
> > > ask users to use the name specifically?  Something like:
> > >
> > >   $ perf record -e riscv-cycles ...
> >
> > So ARM and RISC-V are more than able to speak for themselves and have
> > their tags on the series, but let's recap why I'm motivated to do this
> > change:
> >
> > 1) perf supported legacy events;
> > 2) perf supported sysfs and json events, but at a lower priority than
> > legacy events;
> > 3) hybrid support was added but in a way where all the hybrid PMUs
> > needed to be known, assumptions about PMU were implicit and baked into
> > the tool;
> > 4) metric support for hybrid was going in a similar implicit direction
> > and I objected, what would cycles mean in a metric if the core PMU was
>
> If the legacy cycles event in a metric is a problem, can we change the
> metric to be more specific?
>
>
> > implicit? Rather than pursue this the hybrid code was overhauled, PMUs
> > became more of a thing and we added a notion of a "core" PMU which
> > would support legacy events;
> > 5) ARM core PMUs differ in naming, etc. than just about every other
> > platform. Their core events had been being programmed as if they were
> > uncore events - ie without the legacy priority. Fixing hybrid, and
> > fixing ARM PMUs to know they supported legacy events, broke perf on
> > Apple-M? series due to a PMU driver issue with legacy events:
> > https://lore.kernel.org/lkml/08f1f185-e259-4014-9ca4-6411d5c1bc65@marcan.st/
> > "Perf broke on all Apple ARM64 systems (tested almost everything), and
> > according to maz also on Juno (so, probably all big.LITTLE) since
> > v6.5."
> > 6) sysfs/json events were made the priority over legacy to unbreak
> > perf on Apple-M? CPUs, but only if the PMU is specified:
> > https://lore.kernel.org/r/20231123042922.834425-1-irogers@google.com
> >    Reported-by: Hector Martin <marcan@...can.st>
> >    Signed-off-by: Ian Rogers <irogers@...gle.com>
> >    Tested-by: Hector Martin <marcan@...can.st>
> >    Tested-by: Marc Zyngier <maz@...nel.org>
> >    Acked-by: Mark Rutland <mark.rutland@....com>
>
> I think ARM/Apple-Mx is fine without this change, right?
>
> >
> > This gets us to the current code where I can trivially get an
> > inconsistency. Here on Intel with no PMU in the event name:
> > ```
> > $ perf stat -vv -e cpu-cycles true
> > Using CPUID GenuineIntel-6-8D-1
> > Control descriptor is not initialized
> > ------------------------------------------------------------
> > perf_event_attr:
> >   type                             0 (PERF_TYPE_HARDWARE)
> >   size                             136
> >   config                           0 (PERF_COUNT_HW_CPU_CYCLES)
> >   sample_type                      IDENTIFIER
> >   read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> >   disabled                         1
> >   inherit                          1
> >   enable_on_exec                   1
> >   exclude_guest                    1
> > ------------------------------------------------------------
> > sys_perf_event_open: pid 752915  cpu -1  group_fd -1  flags 0x8 = 3
> > cpu-cycles: -1: 1293076 273429 273429
> > cpu-cycles: 1293076 273429 273429
> >
> >  Performance counter stats for 'true':
> >
> >          1,293,076      cpu-cycles
> >
> >        0.000809752 seconds time elapsed
> >
> >        0.000841000 seconds user
> >        0.000000000 seconds sys
> > ```
> >
> > Here with a PMU event name:
> > ```
> > $ sudo perf stat -vv -e cpu/cpu-cycles/ true
> > Using CPUID GenuineIntel-6-8D-1
> > Attempt to add: cpu/cpu-cycles=0/
> > ..after resolving event: cpu/event=0x3c/
> > Control descriptor is not initialized
> > ------------------------------------------------------------
> > perf_event_attr:
> >   type                             4 (cpu)
> >   size                             136
> >   config                           0x3c (cpu-cycles)
> >   sample_type                      IDENTIFIER
> >   read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> >   disabled                         1
> >   inherit                          1
> >   enable_on_exec                   1
> >   exclude_guest                    1
> > ------------------------------------------------------------
> > sys_perf_event_open: pid 752839  cpu -1  group_fd -1  flags 0x8 = 3
> > cpu/cpu-cycles/: -1: 1421235 531150 531150
> > cpu/cpu-cycles/: 1421235 531150 531150
> >
> >  Performance counter stats for 'true':
> >
> >          1,421,235      cpu/cpu-cycles/
> >
> >        0.001292908 seconds time elapsed
> >
> >        0.001340000 seconds user
> >        0.000000000 seconds sys
> > ```
> >
> > That is the no PMU event is opened as type=0/config=0 (legacy) while
> > the PMU event is opened as type=4/config=0x3c (sysfs encoding). Now
>
> I'm not sure it's a problem.  I think it works as expected...?
>
>
> > let's cross our fingers and hope that in the driver they are really
> > the same thing. I take objection to the idea that there should be two
> > different priorities for sysfs/json and legacy depending on whether a
> > PMU is or isn't specified in the event name. The priority could be
> > legacy then sysfs/json, or it could be sysfs/json then legacy, but it
> > should be the same regardless of whether the PMU is put in the event
>
> Well, I think having PMU name in the event is a big difference.  Legacy
> events were there since Day 1, I guess it's natural to think that an
> event without PMU name means a legacy event and others should come with
> PMU names explicitly.

So then we're breaking the event names by inserting a PMU name in
uniquify in the stat output:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/stat-display.c?h=perf-tools-next#n932

There was an explicit, and reviewed by Jiri and Arnaldo, intent with
the hybrid work that using a legacy event with a hybrid PMU, even
though the PMU doesn't advertise through json or sysfs the legacy
event, the perf tool supports it.

Making it so that events without PMUs are only legacy events just
doesn't work. There are far too many existing uses of non-legacy
events without PMU, the metrics contain 100s of examples.

Prior to switching json/sysfs to being the priority when a PMU is
specified, it was the case that all encodings were the same, with or
without a PMU.

I don't think there is anything natural about assuming things about
event names. Take cycles, cpu-cycles and cpu_cycles:
 - cycles on x86 is only encoded via a legacy event;
 - cpu-cycles on Intel exists as a sysfs event, but cpu-cycles is also
a legacy event name;
 - cpu_cycles exists as a sysfs event on ARM but doesn't have a
corresponding legacy event name.

The difference in meaning of an event name can be as subtle as the
difference between a hyphen and an underscore. Given that we can't
break everybody's `perf <command> -e <event name> ..` command name nor
should we break all the metrics, I think the most intuitive thing is
cycles behave the same with or without a PMU. For example, there may
be differences in accuracy between a fixed and generic counter and the
legacy event may only work with one counter because of this while the
sysfs/json event uses all the counters, or vice versa. As explained,
in output code the tool will or will not insert PMU names treating
them as not mattering. Currently they do matter as the parsing will
give different perf_event_attr and those can have differing kernel
behaviors. This patch fixes this.

Thanks,
Ian

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ