lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAP-5=fXDdcjMmn8iBenjPmZZQdB=AX+gc4TYDsHXuwH9TYq4Ng@mail.gmail.com>
Date: Fri, 24 May 2024 20:47:59 -0700
From: Ian Rogers <irogers@...gle.com>
To: Arnaldo Carvalho de Melo <acme@...nel.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>, Leo Yan <leo.yan@...ux.dev>, 
	Mark Rutland <mark.rutland@....com>, Ingo Molnar <mingo@...nel.org>, 
	Thomas Gleixner <tglx@...utronix.de>, Jiri Olsa <jolsa@...nel.org>, 
	Namhyung Kim <namhyung@...nel.org>, Adrian Hunter <adrian.hunter@...el.com>, 
	Clark Williams <williams@...hat.com>, Kate Carcia <kcarcia@...hat.com>, linux-kernel@...r.kernel.org, 
	linux-perf-users@...r.kernel.org, Anne Macedo <retpolanne@...teo.net>, 
	Bhaskar Chowdhury <unixbhaskar@...il.com>, Ethan Adams <j.ethan.adams@...il.com>, 
	James Clark <james.clark@....com>, Kan Liang <kan.liang@...ux.intel.com>, 
	Thomas Richter <tmricht@...ux.ibm.com>, Tycho Andersen <tycho@...ho.pizza>, 
	Yang Jihong <yangjihong@...edance.com>
Subject: Re: [GIT PULL] perf tools changes for v6.10

On Fri, May 24, 2024 at 7:02 PM Arnaldo Carvalho de Melo
<acme@...nel.org> wrote:
>
> On Fri, May 24, 2024 at 10:55:11PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Fri, May 24, 2024 at 06:31:52PM -0700, Linus Torvalds wrote:
> > > On Tue, 21 May 2024 at 12:26, Arnaldo Carvalho de Melo <acme@...nel.org> wrote:
> > > >
> > > > perf tools fixes and improvements for v6.10:
> > >
> > > This actually broke 'perf' completely for me on arm64.
> > >
> > > With a 6.9 version of 'perf', I can do this:
> > >
> > >     perf record -e cycles:pp make -j199
> > >
> > > and it all works fine.
> > >
> > > With the current -git version, when I do the same, I instead get
> > >
> > >   Error:
> > >   cycles:pp: PMU Hardware doesn't support
> > > sampling/overflow-interrupts. Try 'perf stat'
> > >
> > > and after trying desperately to chase down what went wrong on the
> > > kernel side, I finally figured out that it wasn't a kernel change at
> > > all, it was the tooling that had changed.
> > >
> > > I did a 'git bisect', and it says
> > >
> > >   617824a7f0f73e4de325cf8add58e55b28c12493 is the first bad commit
> > >   commit 617824a7f0f73e4de325cf8add58e55b28c12493
> > >   Author: Ian Rogers <irogers@...gle.com>
> > >   Date:   Mon Apr 15 23:15:25 2024 -0700
> > >
> > >       perf parse-events: Prefer sysfs/JSON hardware events over legacy
> > >
> > > and very clearly this does *NOT* work at all for me.
> > >
> > > I didn't notice until now, simply because I had been busy with the
> > > merge window, so I hadn't been doing any profiles, but the merge
> > > window is calming down and the end is nigh, and I just wasted more
> > > time than I care to admit trying to figure out what went wrong in the
> > > kernel.
> > >
> > > And no, I don't speak JSON, and I have *no* idea what the legacy
> > > events are. Plus I'm not very familiar with the arm64 profiling etc
> > > anyway, so I'm just a clueless user here.
> > >
> > > I *can* confirm that just reverting that commit makes that trivial
> > > "perf record" work for me. So the bisect was good, and it reverts
> > > cleanly, but I don't know _why_ my arm64 setup hates it so much.

Thanks for the report. TL;DR try putting the PMU name with the event
name, so "cycles:pp" becomes "armv8_pmuv3_0/cycles/pp", where
armv8_pmuv3_0 is the name of the PMU from /sys/devices.

The perf tool has a number of inbuilt "legacy" event names, cycles is
one. Most events these days are either found in sysfs (this is on a
Raspberry Pi 5):
```
$ ls /sys/devices/armv8_cortex_a76/events/
br_mis_pred          exc_return        l1d_tlb_refill      l2d_tlb
        remote_access
br_mis_pred_retired  exc_taken         l1i_cache
l2d_tlb_refill      stall_backend
br_pred              inst_retired      l1i_cache_refill    l3d_cache
        stall_frontend
br_retired           inst_spec         l1i_tlb
l3d_cache_allocate  sw_incr
bus_access           itlb_walk         l1i_tlb_refill
l3d_cache_refill    ttbr_write_retired
bus_cycles           l1d_cache         l2d_cache           ll_cache_miss_rd
cid_write_retired    l1d_cache_refill  l2d_cache_allocate  ll_cache_rd
cpu_cycles           l1d_cache_wb      l2d_cache_refill    mem_access
dtlb_walk            l1d_tlb           l2d_cache_wb        memory_error
```
 or json that is turned into tables built into the perf tool.

ARM requested that sysfs and json events take priority over legacy
events. So 'armv8_cortex_a76/cycles/' should first try to open an
event in either sysfs or json, and if not present fallback on using
the legacy constants. Note that the event name has the PMU name first.
What should the behavior of the 'cycles' event with no PMU be? In
Linux 6.9 the behavior was that cycles without a PMU would be a legacy
encoding and only try to be on the core's PMU (generally cpu on Intel
or armv8... on ARM), but with a PMU it would prefer sysfs and json
tables.

RISC-V had asked that in the no PMU case they'd also like sysfs/json
to have priority so the driver could be more ignorant of event
encodings. It was also inconsistent that in Linux 6.9:

$ perf stat -e inst_retired.any true

was a sysfs/json event that we'd try to open on every PMU but:

$ perf stat -e instructions true

was a legacy event that would only be opened on the core PMU. (I'm
ignoring the complexity that BIG.little/hybrid adds).

The blamed patch does away with the inconsistency and makes it so that
legacy events are always the 2nd choice and we try to open every event
on every PMU. It is the 2nd point that I think is breaking you. I
think you have a PMU on your system with a cycles event, maybe an
uncore PMU with memory controller data or CXL, and the perf record is
trying to open the cycles event on that and the error message
correctly reports that sampling wouldn't be possible on that PMU.

Putting verbose flags on perf record:

$ perf record -vv ...

should hopefully give more breadcrumbs and confirm this. You could also do:

$ ls /sys/devices/*/events/cycles

If there are more than 1 sysfs cycles event then probably one of them
is the problem. Adding the PMU name removes the trying every PMU
behavior and so should be a fix. We could also change it so that when
we open multiple events for perf record we don't fail in a case like
this. Maybe it is a feature to fail.

Thanks,
Ian

> > That is a good data point, we probably could go with the revert, but I
> > think Ian submitted a few patches fixing this issue that came up close
> > to LSFMM/BPF and the merge window, so didn't have time to sit on
> > linux-next for a while, I'm looking those up now.
>
> Couldn't find it quickly, its late here, perhaps Ian can chime in and
> point those fixes here. I'll try and continue tomorrow.
>
> - Arnaldo
>
> > ARM64 eyes on this would also be good. Adding Mark Rutland and Leo Yan
> > to the CC list, maybe they can help us here with the best course of
> > action.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ