[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <714ed350-0e6c-4922-bf65-36de48f62879@leemhuis.info>
Date: Wed, 7 Aug 2024 10:54:29 +0200
From: Thorsten Leemhuis <regressions@...mhuis.info>
To: Arnaldo Carvalho de Melo <acme@...hat.com>
Cc: Linux perf Profiling <linux-perf-users@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
James Clark <james.clark@....com>, "cc: Marc Zyngier" <maz@...nel.org>,
Hector Martin <marcan@...can.st>, Asahi Linux <asahi@...ts.linux.dev>,
Ian Rogers <irogers@...gle.com>,
Linux regressions mailing list <regressions@...ts.linux.dev>,
"to: Mark Rutland" <mark.rutland@....com>
Subject: Re: [REGRESSION] Perf (userspace) broken on big.LITTLE systems since
v6.5
On 01.08.24 21:05, Ian Rogers wrote:
> On Wed, Dec 6, 2023 at 4:09 AM Linux regression tracking #update
> (Thorsten Leemhuis) <regressions@...mhuis.info> wrote:
>>
>> [TLDR: This mail in primarily relevant for Linux kernel regression
>> tracking. See link in footer if these mails annoy you.]
>>
>> On 22.11.23 00:43, Bagas Sanjaya wrote:
>>> On Tue, Nov 21, 2023 at 09:08:48PM +0900, Hector Martin wrote:
>>>> Perf broke on all Apple ARM64 systems (tested almost everything), and
>>>> according to maz also on Juno (so, probably all big.LITTLE) since v6.5.
>>
>> #regzbot fix: perf parse-events: Make legacy events lower priority than
>> sysfs/JSON
>> #regzbot ignore-activity
>
> Note, this is still broken.
Hmmm, so all that became somewhat messy. Arnaldo, what's the way out of
this? Or is this a "we are screwed one way or another and someone has to
bite the bullet" situation?
Ciao, Thorsten
> The patch changed the priority in the case
> that you do something like:
>
> $ perf stat -e 'armv8_pmuv3_0/cycles/' benchmark
>
> but if you do:
>
> $ perf stat -e 'cycles' benchmark
>
> then the broken behavior will happen as legacy events have priority
> over sysfs/json events in that case. To fix this you need to revert:
> 4f1b067359ac Revert "perf parse-events: Prefer sysfs/JSON hardware
> events over legacy"
>
> This causes some testing issues resolved in this unmerged patch series:
> https://lore.kernel.org/lkml/20240510053705.2462258-1-irogers@google.com/
>
> There is a bug as the arm_dsu PMU advertises an event called "cycles"
> and this PMU is present on Ampere systems. Reverting the commit above
> will cause an issue as the commit 7b100989b4f6 ("perf evlist: Remove
> __evlist__add_default") to fix ARM's BIG.little systems (opening a
> cycles event on all PMUs not just 1) will cause the arm_dsu event to
> be opened by perf record and fail as the event won't support sampling.
>
> The patch https://lore.kernel.org/lkml/20240525152927.665498-1-irogers@google.com/
> fixes this by only opening the cycles event on core PMUs when choosing
> default events.
>
> Rather than take this patch the revert happened as Linus runs the
> command "perf record -e cycles:pp" (ie using a specified event and not
> defaults) and considers it a regression in the perf tool that on an
> Ampere system to need to do "perf record -e
> 'armv8_pmuv3_0/cycles/pp'". It was pointed out that not specifying -e
> will choose the cycles event correctly and with better precision the
> pp for systems that support it, but it was still considered a
> regression in the perf tool so the revert was made to happen. There is
> a lack of perf testing coverage for ARM, in particular as they choose
> to do everything in a different way to x86. The patch in question was
> in the linux-next tree for weeks without issues.
>
> ARM/Ampere could fix this by renaming the event from cycles to
> cpu_cycles, or by following Intel's convention that anything uncore
> uses the name clockticks rather than cycles. This could break people
> who rely on an event called arm_dsu/cycles/ but I imagine such people
> are rare. There has been no progress I'm aware of on renaming the
> event.
>
> Making perf not terminate on opening an event for perf record seems
> like the most likely workaround as that is at least something under
> the tool maintainers control. ARM have discussed doing this on the
> lists:
> https://lore.kernel.org/lkml/f30f676e-a1d7-4d6b-94c1-3bdbd1448887@arm.com/
> but since the revert in v6.10 no patches have appeared for the v6.11
> merge window. Feature work like coresight improvements and ARMv9 are
> being actively pursued by ARM, but feature work won't resolve this
> regression.
>
> I'm keen to see such patches as there are perf stat fixes reliant on
> the stacked parse event fixes that are consequently not merged
> affecting more than just ARM.
>
> There is a related discussion that events specified without PMUs
> should inherently only mean core PMUs. Unfortunately such a change
> would break uncore events specified without a PMU, for example `perf
> stat -e data_read -a sleep 1` gathers read memory bandwidth on uncore
> memory controllers on recent Intel devices. Not specifying a PMU for
> uncore events is also assumed by perf metrics, so a large number of
> metrics would need updating to make such a change work. Many existing
> JSON uncore events specify a PMU in their name like
> UNC_M2HBM_CMS_CLOCKTICKS and it feels somewhat redundant to have to
> make that h2hbm/UNC_M2HBM_CMS_CLOCKTICKS/. It is unclear who would
> pursue fixing all of this, and so it seems not specifying a PMU with
> an event for perf will keep meaning trying to open the event on all
> PMUs that advertise such an event.
>
> Thanks,
> Ian
>
>> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
>> --
>> Everything you wanna know about Linux kernel regression tracking:
>> https://linux-regtracking.leemhuis.info/about/#tldr
>> That page also explains what to do if mails like this annoy you.
>>
>
>
Powered by blists - more mailing lists