lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAP-5=fUWm7efu3xdUBbiifs_KNU1igwAxbXmum=V38SjHQHtXg@mail.gmail.com>
Date:   Tue, 21 Nov 2023 08:09:37 -0800
From:   Ian Rogers <irogers@...gle.com>
To:     Mark Rutland <mark.rutland@....com>
Cc:     Marc Zyngier <maz@...nel.org>, Hector Martin <marcan@...can.st>,
        Arnaldo Carvalho de Melo <acme@...hat.com>,
        James Clark <james.clark@....com>,
        linux-perf-users@...r.kernel.org,
        LKML <linux-kernel@...r.kernel.org>,
        Asahi Linux <asahi@...ts.linux.dev>
Subject: Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5

On Tue, Nov 21, 2023 at 8:03 AM Mark Rutland <mark.rutland@....com> wrote:
>
> On Tue, Nov 21, 2023 at 07:46:57AM -0800, Ian Rogers wrote:
> > On Tue, Nov 21, 2023 at 7:40 AM Mark Rutland <mark.rutland@....com> wrote:
> > >
> > > On Tue, Nov 21, 2023 at 03:24:25PM +0000, Marc Zyngier wrote:
> > > > On Tue, 21 Nov 2023 13:40:31 +0000,
> > > > Marc Zyngier <maz@...nel.org> wrote:
> > > > >
> > > > > [Adding key people on Cc]
> > > > >
> > > > > On Tue, 21 Nov 2023 12:08:48 +0000,
> > > > > Hector Martin <marcan@...can.st> wrote:
> > > > > >
> > > > > > Perf broke on all Apple ARM64 systems (tested almost everything), and
> > > > > > according to maz also on Juno (so, probably all big.LITTLE) since v6.5.
> > > > >
> > > > > I can confirm that at least on 6.7-rc2, perf is pretty busted on any
> > > > > asymmetric ARM platform. It isn't clear what criteria is used to pick
> > > > > the PMU, but nothing works anymore.
> > > > >
> > > > > The saving grace in my case is that Debian still ships a 6.1 perftool
> > > > > package, but that's obviously not going to last.
> > > > >
> > > > > I'm happy to test potential fixes.
> > > >
> > > > At Mark's request, I've dumped a couple of perf (as of -rc2) runs with
> > > > -vvv.  And it is quite entertaining (this is taskset to an 'icestorm'
> > > > CPU):
> > >
> > > IIUC the tool is doing the wrong thing here and overriding explicit
> > > ${pmu}/${event}/ events with PERF_TYPE_HARDWARE events rather than events using
> > > that ${pmu}'s type and event namespace.
> > >
> > > Regardless of the *new* ABI that allows PERF_TYPE_HARDWARE events to be
> > > targetted to a specific PMU, it's semantically wrong to rewrite events like
> > > this since ${pmu}/${event}/ is not necessarily equivalent to a similarly-named
> > > PERF_COUNT_HW_${EVENT}.
> >
> > If you name a PMU and an event then the event should only be opened on
> > that PMU, 100% agree. There's a bunch of output, but when the legacy
> > cycles event is opened it appears to be because it was explicitly
> > requested.
>
> I think you've missed that the named PMU events are being erreously transformed
> into PERF_TYPE_HARDWARE events. Look at the -vvv output, e.g.
>
>   Opening: apple_firestorm_pmu/cycles/
>   ------------------------------------------------------------
>   perf_event_attr:
>     type                             0 (PERF_TYPE_HARDWARE)
>     size                             136
>     config                           0 (PERF_COUNT_HW_CPU_CYCLES)
>     sample_type                      IDENTIFIER
>     read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>     disabled                         1
>     inherit                          1
>     enable_on_exec                   1
>     exclude_guest                    1
>   ------------------------------------------------------------
>   sys_perf_event_open: pid 1045843  cpu -1  group_fd -1  flags 0x8 = 4
>
> ... which should not be PERF_TYPE_HARDWARE && PERF_COUNT_HW_CPU_CYCLES.
>
> Marc said that he bisected the issue down to commit:
>
>   5ea8f2ccffb23983 ("perf parse-events: Support hardware events as terms")
>
> ... so it looks like something is going wrong when the events are being parsed,
> e.g. losing the HW PMU information?

Ok, I think I'm getting confused by other things. This looks like the issue.

I think it may be working as intended, but not how you intended :-) If
a core PMU is listed and then a legacy event, the legacy event should
be opened on the core PMU as a legacy event with the extended type
set. This is to allow things like legacy cache events to be opened on
a specified PMU. Legacy event names match with a higher priority than
those in sysfs or json as they are hard coded. Presumably the
expectation was that by advertising a cycles event, presumably in
sysfs, then this is what would be matched.

Thanks,
Ian

> Thanks,
> Mark.
>
> >
> >
> > Thanks,
> > Ian
> >
> > > Mark.
> > >
> > > > <quote>
> > > > maz@...ley-girl:~/hot-poop/arm-platforms/tools/perf$ sudo taskset -c 0 ./perf stat -vvv -e apple_icestorm_pmu/cycles/ -e
> > > >  apple_firestorm_pmu/cycles/ -e cycles ls
> > > > Using CPUID 0x00000000612f0280
> > > > Attempt to add: apple_icestorm_pmu/cycles=0/
> > > > ..after resolving event: apple_icestorm_pmu/cycles=0/
> > > > Opening: unknown-hardware:HG
> > > > ------------------------------------------------------------
> > > > perf_event_attr:
> > > >   type                             0 (PERF_TYPE_HARDWARE)
> > > >   config                           0xb00000000
> > > >   disabled                         1
> > > > ------------------------------------------------------------
> > > > sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8
> > > > sys_perf_event_open failed, error -95
> > > > Attempt to add: apple_firestorm_pmu/cycles=0/
> > > > ..after resolving event: apple_firestorm_pmu/cycles=0/
> > > > Control descriptor is not initialized
> > > > Opening: apple_icestorm_pmu/cycles/
> > > > ------------------------------------------------------------
> > > > perf_event_attr:
> > > >   type                             0 (PERF_TYPE_HARDWARE)
> > > >   size                             136
> > > >   config                           0 (PERF_COUNT_HW_CPU_CYCLES)
> > > >   sample_type                      IDENTIFIER
> > > >   read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> > > >   disabled                         1
> > > >   inherit                          1
> > > >   enable_on_exec                   1
> > > >   exclude_guest                    1
> > > > ------------------------------------------------------------
> > > > sys_perf_event_open: pid 1045843  cpu -1  group_fd -1  flags 0x8 = 3
> > > > Opening: apple_firestorm_pmu/cycles/
> > > > ------------------------------------------------------------
> > > > perf_event_attr:
> > > >   type                             0 (PERF_TYPE_HARDWARE)
> > > >   size                             136
> > > >   config                           0 (PERF_COUNT_HW_CPU_CYCLES)
> > > >   sample_type                      IDENTIFIER
> > > >   read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> > > >   disabled                         1
> > > >   inherit                          1
> > > >   enable_on_exec                   1
> > > >   exclude_guest                    1
> > > > ------------------------------------------------------------
> > > > sys_perf_event_open: pid 1045843  cpu -1  group_fd -1  flags 0x8 = 4
> > > > Opening: cycles
> > > > ------------------------------------------------------------
> > > > perf_event_attr:
> > > >   type                             0 (PERF_TYPE_HARDWARE)
> > > >   size                             136
> > > >   config                           0 (PERF_COUNT_HW_CPU_CYCLES)
> > > >   sample_type                      IDENTIFIER
> > > >   read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> > > >   disabled                         1
> > > >   inherit                          1
> > > >   enable_on_exec                   1
> > > >   exclude_guest                    1
> > > > ------------------------------------------------------------
> > > > sys_perf_event_open: pid 1045843  cpu -1  group_fd -1  flags 0x8 = 5
> > > > arch                  builtin-diff.o      builtin-mem.o        common-cmds.h    perf-completion.sh
> > > > bench                 builtin-evlist.c    builtin-probe.c      CREDITS          perf.h
> > > > Build                 builtin-evlist.o    builtin-probe.o      design.txt       perf-in.o
> > > > builtin-annotate.c    builtin-ftrace.c    builtin-record.c     dlfilters        perf-iostat
> > > > builtin-annotate.o    builtin-ftrace.o    builtin-record.o     Documentation    perf-iostat.sh
> > > > builtin-bench.c               builtin.h           builtin-report.c     FEATURE-DUMP     perf.o
> > > > builtin-bench.o               builtin-help.c      builtin-report.o     include          perf-read-vdso.c
> > > > builtin-buildid-cache.c  builtin-help.o      builtin-sched.c   jvmti            perf-sys.h
> > > > builtin-buildid-cache.o  builtin-inject.c    builtin-script.c  libapi   PERF-VERSION-FILE
> > > > builtin-buildid-list.c        builtin-inject.o    builtin-script.o     libperf          perf-with-kcore
> > > > builtin-buildid-list.o        builtin-kallsyms.c  builtin-stat.c       libsubcmd        pmu-events
> > > > builtin-c2c.c         builtin-kallsyms.o  builtin-stat.o       libsymbol        python
> > > > builtin-c2c.o         builtin-kmem.c      builtin-timechart.c  Makefile         python_ext_build
> > > > builtin-config.c      builtin-kvm.c       builtin-top.c        Makefile.config  scripts
> > > > builtin-config.o      builtin-kvm.o       builtin-top.o        Makefile.perf    tests
> > > > builtin-daemon.c      builtin-kwork.c     builtin-trace.c      MANIFEST         trace
> > > > builtin-daemon.o      builtin-list.c      builtin-version.c    perf             ui
> > > > builtin-data.c                builtin-list.o      builtin-version.o    perf-archive     util
> > > > builtin-data.o                builtin-lock.c      check-headers.sh     perf-archive.sh
> > > > builtin-diff.c                builtin-mem.c       command-list.txt     perf.c
> > > > apple_icestorm_pmu/cycles/: -1: 0 873709 0
> > > > apple_firestorm_pmu/cycles/: -1: 0 873709 0
> > > > cycles: -1: 0 873709 0
> > > > apple_icestorm_pmu/cycles/: 0 873709 0
> > > > apple_firestorm_pmu/cycles/: 0 873709 0
> > > > cycles: 0 873709 0
> > > >
> > > >  Performance counter stats for 'ls':
> > > >
> > > >      <not counted>      apple_icestorm_pmu/cycles/                                              (0.00%)
> > > >      <not counted>      apple_firestorm_pmu/cycles/                                             (0.00%)
> > > >      <not counted>      cycles                                                                  (0.00%)
> > > >
> > > >        0.000002250 seconds time elapsed
> > > >
> > > >        0.000000000 seconds user
> > > >        0.000000000 seconds sys
> > > > </quote>
> > > >
> > > > If I run the same thing on another CPU cluster (firestorm), I get
> > > > this:
> > > >
> > > > <quote>
> > > > maz@...ley-girl:~/hot-poop/arm-platforms/tools/perf$ sudo taskset -c 2 ./perf stat -vvv -e apple_icestorm_pmu/cycles/ -e
> > > >  apple_firestorm_pmu/cycles/ -e cycles ls
> > > > Using CPUID 0x00000000612f0280
> > > > Attempt to add: apple_icestorm_pmu/cycles=0/
> > > > ..after resolving event: apple_icestorm_pmu/cycles=0/
> > > > Opening: unknown-hardware:HG
> > > > ------------------------------------------------------------
> > > > perf_event_attr:
> > > >   type                             0 (PERF_TYPE_HARDWARE)
> > > >   config                           0xb00000000
> > > >   disabled                         1
> > > > ------------------------------------------------------------
> > > > sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8
> > > > sys_perf_event_open failed, error -95
> > > > Attempt to add: apple_firestorm_pmu/cycles=0/
> > > > ..after resolving event: apple_firestorm_pmu/cycles=0/
> > > > Control descriptor is not initialized
> > > > Opening: apple_icestorm_pmu/cycles/
> > > > ------------------------------------------------------------
> > > > perf_event_attr:
> > > >   type                             0 (PERF_TYPE_HARDWARE)
> > > >   size                             136
> > > >   config                           0 (PERF_COUNT_HW_CPU_CYCLES)
> > > >   sample_type                      IDENTIFIER
> > > >   read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> > > >   disabled                         1
> > > >   inherit                          1
> > > >   enable_on_exec                   1
> > > >   exclude_guest                    1
> > > > ------------------------------------------------------------
> > > > sys_perf_event_open: pid 1045925  cpu -1  group_fd -1  flags 0x8 = 3
> > > > Opening: apple_firestorm_pmu/cycles/
> > > > ------------------------------------------------------------
> > > > perf_event_attr:
> > > >   type                             0 (PERF_TYPE_HARDWARE)
> > > >   size                             136
> > > >   config                           0 (PERF_COUNT_HW_CPU_CYCLES)
> > > >   sample_type                      IDENTIFIER
> > > >   read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> > > >   disabled                         1
> > > >   inherit                          1
> > > >   enable_on_exec                   1
> > > >   exclude_guest                    1
> > > > ------------------------------------------------------------
> > > > sys_perf_event_open: pid 1045925  cpu -1  group_fd -1  flags 0x8 = 4
> > > > Opening: cycles
> > > > ------------------------------------------------------------
> > > > perf_event_attr:
> > > >   type                             0 (PERF_TYPE_HARDWARE)
> > > >   size                             136
> > > >   config                           0 (PERF_COUNT_HW_CPU_CYCLES)
> > > >   sample_type                      IDENTIFIER
> > > >   read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> > > >   disabled                         1
> > > >   inherit                          1
> > > >   enable_on_exec                   1
> > > >   exclude_guest                    1
> > > > ------------------------------------------------------------
> > > > sys_perf_event_open: pid 1045925  cpu -1  group_fd -1  flags 0x8 = 5
> > > > arch                  builtin-diff.o      builtin-mem.o        common-cmds.h    perf-completion.sh
> > > > bench                 builtin-evlist.c    builtin-probe.c      CREDITS          perf.h
> > > > Build                 builtin-evlist.o    builtin-probe.o      design.txt       perf-in.o
> > > > builtin-annotate.c    builtin-ftrace.c    builtin-record.c     dlfilters        perf-iostat
> > > > builtin-annotate.o    builtin-ftrace.o    builtin-record.o     Documentation    perf-iostat.sh
> > > > builtin-bench.c               builtin.h           builtin-report.c     FEATURE-DUMP     perf.o
> > > > builtin-bench.o               builtin-help.c      builtin-report.o     include          perf-read-vdso.c
> > > > builtin-buildid-cache.c  builtin-help.o      builtin-sched.c   jvmti            perf-sys.h
> > > > builtin-buildid-cache.o  builtin-inject.c    builtin-script.c  libapi   PERF-VERSION-FILE
> > > > builtin-buildid-list.c        builtin-inject.o    builtin-script.o     libperf          perf-with-kcore
> > > > builtin-buildid-list.o        builtin-kallsyms.c  builtin-stat.c       libsubcmd        pmu-events
> > > > builtin-c2c.c         builtin-kallsyms.o  builtin-stat.o       libsymbol        python
> > > > builtin-c2c.o         builtin-kmem.c      builtin-timechart.c  Makefile         python_ext_build
> > > > builtin-config.c      builtin-kvm.c       builtin-top.c        Makefile.config  scripts
> > > > builtin-config.o      builtin-kvm.o       builtin-top.o        Makefile.perf    tests
> > > > builtin-daemon.c      builtin-kwork.c     builtin-trace.c      MANIFEST         trace
> > > > builtin-daemon.o      builtin-list.c      builtin-version.c    perf             ui
> > > > builtin-data.c                builtin-list.o      builtin-version.o    perf-archive     util
> > > > builtin-data.o                builtin-lock.c      check-headers.sh     perf-archive.sh
> > > > builtin-diff.c                builtin-mem.c       command-list.txt     perf.c
> > > > apple_icestorm_pmu/cycles/: -1: 1035101 469125 469125
> > > > apple_firestorm_pmu/cycles/: -1: 1035035 469125 469125
> > > > cycles: -1: 1034653 469125 469125
> > > > apple_icestorm_pmu/cycles/: 1035101 469125 469125
> > > > apple_firestorm_pmu/cycles/: 1035035 469125 469125
> > > > cycles: 1034653 469125 469125
> > > >
> > > >  Performance counter stats for 'ls':
> > > >
> > > >          1,035,101      apple_icestorm_pmu/cycles/
> > > >          1,035,035      apple_firestorm_pmu/cycles/
> > > >          1,034,653      cycles
> > > >
> > > >        0.000001333 seconds time elapsed
> > > >
> > > >        0.000000000 seconds user
> > > >        0.000000000 seconds sys
> > > > </quote>
> > > >
> > > > which doesn't make any sense either. I really don't understand what
> > > > this PERF_TYPE_HARDWARE does here (the *real* types are 10 and 11),
> > > > nor what this 'cycle=0' stuff is.
> > > >
> > > > /puzzled
> > > >
> > > >       M.
> > > >
> > > > --
> > > > Without deviation from the norm, progress is not possible.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ