lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZV38z3+p2S2ETtzG@kernel.org>
Date:   Wed, 22 Nov 2023 10:06:23 -0300
From:   Arnaldo Carvalho de Melo <acme@...nel.org>
To:     Hector Martin <marcan@...can.st>
Cc:     Ian Rogers <irogers@...gle.com>,
        Mark Rutland <mark.rutland@....com>,
        Marc Zyngier <maz@...nel.org>,
        Arnaldo Carvalho de Melo <acme@...hat.com>,
        James Clark <james.clark@....com>,
        linux-perf-users@...r.kernel.org,
        LKML <linux-kernel@...r.kernel.org>,
        Asahi Linux <asahi@...ts.linux.dev>
Subject: Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since
 v6.5

Em Wed, Nov 22, 2023 at 12:23:27PM +0900, Hector Martin escreveu:
> On 2023/11/22 1:38, Ian Rogers wrote:
> > On Tue, Nov 21, 2023 at 8:15 AM Mark Rutland <mark.rutland@....com> wrote:
> >> On Tue, Nov 21, 2023 at 08:09:37AM -0800, Ian Rogers wrote:
> >>> On Tue, Nov 21, 2023 at 8:03 AM Mark Rutland <mark.rutland@....com> wrote:
> >>>> On Tue, Nov 21, 2023 at 07:46:57AM -0800, Ian Rogers wrote:
> >>>>> On Tue, Nov 21, 2023 at 7:40 AM Mark Rutland <mark.rutland@....com> wrote:
> >>>>>> On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote:
> >>>>>>> On Tue, 21 Nov 2023 13:40:31 +0000,
> >>>>>>> Marc Zyngier <maz@...nel.org> wrote:
> >>>>>>>>
> >>>>>>>> [Adding key people on Cc]
> >>>>>>>>
> >>>>>>>> On Tue, 21 Nov 2023 12:08:48 +0000,
> >>>>>>>> Hector Martin <marcan@...can.st> wrote:
> >>>>>>>>>
> >>>>>>>>> Perf broke on all Apple ARM64 systems (tested almost everything), and
> >>>>>>>>> according to maz also on Juno (so, probably all big.LITTLE) since v6.5.
> >>>>>>>>
> >>>>>>>> I can confirm that at least on 6.7-rc2, perf is pretty busted on any
> >>>>>>>> asymmetric ARM platform. It isn't clear what criteria is used to pick
> >>>>>>>> the PMU, but nothing works anymore.
> >>>>>>>>
> >>>>>>>> The saving grace in my case is that Debian still ships a 6.1 perftool
> >>>>>>>> package, but that's obviously not going to last.
> >>>>>>>>
> >>>>>>>> I'm happy to test potential fixes.
> >>>>>>>
> >>>>>>> At Mark's request, I've dumped a couple of perf (as of -rc2) runs with
> >>>>>>> -vvv.  And it is quite entertaining (this is taskset to an 'icestorm'
> >>>>>>> CPU):
> >>>>>>
> >>>>>> IIUC the tool is doing the wrong thing here and overriding explicit
> >>>>>> ${pmu}/${event}/ events with PERF_TYPE_HARDWARE events rather than events using
> >>>>>> that ${pmu}'s type and event namespace.
> >>>>>>
> >>>>>> Regardless of the *new* ABI that allows PERF_TYPE_HARDWARE events to be
> >>>>>> targetted to a specific PMU, it's semantically wrong to rewrite events like
> >>>>>> this since ${pmu}/${event}/ is not necessarily equivalent to a similarly-named
> >>>>>> PERF_COUNT_HW_${EVENT}.
> >>>>>
> >>>>> If you name a PMU and an event then the event should only be opened on
> >>>>> that PMU, 100% agree. There's a bunch of output, but when the legacy
> >>>>> cycles event is opened it appears to be because it was explicitly
> >>>>> requested.
> >>>>
> >>>> I think you've missed that the named PMU events are being erreously transformed
> >>>> into PERF_TYPE_HARDWARE events. Look at the -vvv output, e.g.
> >>>>
> >>>>   Opening: apple_firestorm_pmu/cycles/
> >>>>   ------------------------------------------------------------
> >>>>   perf_event_attr:
> >>>>     type                             0 (PERF_TYPE_HARDWARE)
> >>>>     size                             136
> >>>>     config                           0 (PERF_COUNT_HW_CPU_CYCLES)
> >>>>     sample_type                      IDENTIFIER
> >>>>     read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> >>>>     disabled                         1
> >>>>     inherit                          1
> >>>>     enable_on_exec                   1
> >>>>     exclude_guest                    1
> >>>>   ------------------------------------------------------------
> >>>>   sys_perf_event_open: pid 1045843  cpu -1  group_fd -1  flags 0x8 = 4
> >>>>
> >>>> ... which should not be PERF_TYPE_HARDWARE && PERF_COUNT_HW_CPU_CYCLES.
> >>>>
> >>>> Marc said that he bisected the issue down to commit:
> >>>>
> >>>>   5ea8f2ccffb23983 ("perf parse-events: Support hardware events as terms")
> >>>>
> >>>> ... so it looks like something is going wrong when the events are being parsed,
> >>>> e.g. losing the HW PMU information?
> >>>
> >>> Ok, I think I'm getting confused by other things. This looks like the issue.
> >>>
> >>> I think it may be working as intended, but not how you intended :-) If
> >>> a core PMU is listed and then a legacy event, the legacy event should

The point is that "cycles" when prefixed with "pmu/" shouldn't be
considered "cycles" as HW/0, in that setting it is "cycles" for that
PMU. (but we only have "cpu_cycles" for at least the a53 and a72 PMUs I
have access in a Libre Computer rockchip 3399-pc hybrid board, if we use
it, then we get what we want/had before, see below):

And there is an attempt at using the specified PMU, see the first
perf_event_open:

root@...-rk3399-pc:~# strace -e perf_event_open perf stat -vv -e cycles,armv8_cortex_a53/cycles/,armv8_cortex_a72/cycles/ echo
Using CPUID 0x00000000410fd082
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  config                           0x700000000
  disabled                         1
------------------------------------------------------------
sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8perf_event_open({type=PERF_TYPE_HARDWARE, size=0 /* PERF_ATTR_SIZE_??? */, config=0x7<<32|PERF_COUNT_HW_CPU_CYCLES, sample_period=0, sample_type=0, read_format=0, disabled=1, precise_ip=0 /* arbitrary skid */, ...}, 0, -1, -1, PERF_FLAG_FD_CLOEXEC) = -1 ENOENT (No such file or directory)
                                                   
//// HERE: it tries config=0x7<<32|PERF_COUNT_HW_CPU_CYCLES taking into
//account the PMU number 0x7

root@...-rk3399-pc:~# cat /sys/devices/armv8_cortex_a53/type
7
root@...-rk3399-pc:~#

But then we don't have "cycles" in that PMU:

root@...-rk3399-pc:~# ls -la /sys/devices/armv8_cortex_a53/events/cycles
ls: cannot access '/sys/devices/armv8_cortex_a53/events/cycles': No such file or directory
root@...-rk3399-pc:~#

Maybe:

root@...-rk3399-pc:~# taskset -c 5,6 perf stat -v -e armv8_cortex_a53/cpu_cycles/,armv8_cortex_a72/cpu_cycles/ echo
Using CPUID 0x00000000410fd034
Control descriptor is not initialized

armv8_cortex_a53/cpu_cycles/: 0 2079000 0
armv8_cortex_a72/cpu_cycles/: 2488961 2079000 2079000

 Performance counter stats for 'echo':

     <not counted>      armv8_cortex_a53/cpu_cycles/                                            (0.00%)
           2488961      armv8_cortex_a72/cpu_cycles/

       0.003449266 seconds time elapsed

       0.003502000 seconds user
       0.000000000 seconds sys


root@...-rk3399-pc:~# taskset -c 0,1,2,3,4 perf stat -v -e armv8_cortex_a53/cpu_cycles/,armv8_cortex_a72/cpu_cycles/ echo
Using CPUID 0x00000000410fd034
Control descriptor is not initialized

armv8_cortex_a53/cpu_cycles/: 2986601 6999416 6999416
armv8_cortex_a72/cpu_cycles/: 0 6999416 0

 Performance counter stats for 'echo':

           2986601      armv8_cortex_a53/cpu_cycles/
     <not counted>      armv8_cortex_a72/cpu_cycles/                                            (0.00%)

       0.011434508 seconds time elapsed

       0.003911000 seconds user
       0.007454000 seconds sys


root@...-rk3399-pc:~#

root@...-rk3399-pc:~# cat /sys/devices/armv8_cortex_a53/events/cpu_cycles
event=0x0011
root@...-rk3399-pc:~# cat /sys/devices/armv8_cortex_a72/events/cpu_cycles
event=0x0011
root@...-rk3399-pc:~#

And the syscalls seem sane:

root@...-rk3399-pc:~# strace -e perf_event_open taskset -c 0,1,2,3,4 perf stat -v -e armv8_cortex_a53/cpu_cycles/,armv8_cortex_a72/cpu_cycles/ echo
Using CPUID 0x00000000410fd034
Control descriptor is not initialized
perf_event_open({type=0x7 /* PERF_TYPE_??? */, size=0x88 /* PERF_ATTR_SIZE_??? */, config=0x11, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, exclude_guest=1, ...}, 14573, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3
perf_event_open({type=0x8 /* PERF_TYPE_??? */, size=0x88 /* PERF_ATTR_SIZE_??? */, config=0x11, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, exclude_guest=1, ...}, 14573, -1, -1, PERF_FLAG_FD_CLOEXEC) = 4

--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=14573, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
armv8_cortex_a53/cpu_cycles/: 3227098 4480875 4480875
armv8_cortex_a72/cpu_cycles/: 0 4480875 0

 Performance counter stats for 'echo':

           3227098      armv8_cortex_a53/cpu_cycles/
     <not counted>      armv8_cortex_a72/cpu_cycles/                                            (0.00%)

       0.008381759 seconds time elapsed

       0.004064000 seconds user
       0.004121000 seconds sys


--- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=14572, si_uid=0} ---
+++ exited with 0 +++
root@...-rk3399-pc:~#

As:

root@...-rk3399-pc:~# cat /sys/devices/armv8_cortex_a53/type
7
root@...-rk3399-pc:~# cat /sys/devices/armv8_cortex_a72/type
8
root@...-rk3399-pc:~#

See the type=0x7 and type=0x8.

So what we need here seems to be to translate the generic term "cycles"
to "cpu_cycles" when a PMU is explicitely passed in the event name and
it doesn't have "cycles" and then just retry.

- Arnaldo

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ