lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAP-5=fUtS5-vHj+GWDsyWG-2CUYy6e0qhjuUtD9x7FudPsRpOQ@mail.gmail.com>
Date: Wed, 5 Mar 2025 12:38:19 -0800
From: Ian Rogers <irogers@...gle.com>
To: James Clark <james.clark@...aro.org>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, 
	Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>, 
	Mark Rutland <mark.rutland@....com>, 
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, 
	Adrian Hunter <adrian.hunter@...el.com>, Robin Murphy <robin.murphy@....com>, 
	Leo Yan <leo.yan@....com>, linux-perf-users@...r.kernel.org, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/3] perf list: Collapse similar events across PMUs

On Tue, Mar 4, 2025 at 5:50 AM James Clark <james.clark@...aro.org> wrote:
>
> Some changes related to the discussion here [1] about deduplication of
> PMUs. This change stands on its own right now, but it may not go far
> enough. This can either be dropped for now or applied and improved
> later. Extra ideas are as follows.
>
> Treating alphanumeric suffixes as duplicate has been slightly
> problematic due to marketing strings having looks-like-but-not-actually
> alphanumeric suffixes. For example 'cpum_cf' and now the one digit
> longer than before 'cortex-a720'. The easy fix is to increase the
> minimum length considered for deduplication as suggested [1], but as
> also mentioned, the current mrvl_ddr_pmu PMU names don't zero pad the
> address, meaning that > 2 alphanumeric suffixes are already technically
> not enough to deduplicate the same PMUs. They could have only a 2 digit
> alphanumeric address suffix. Increasing the minimum digits feels a bit
> like kicking the can down the road and it places awkward limitations on
> marketing names which we have no control over. Also I'm not sure helps
> the following much:

The hex suffix thing was very unfortunate, can we reverse that
decision and move the physical address.. it captures into a caps file?

Fwiw, we have a similar hex mess with raw events. A raw event can be
r0xead or read, but read also makes a pretty nice event name. This is
solved by first checking if read is an event and if not assuming it is
a raw encoding. Would anyone really care if raw events always had to
start with r0x? We seem to get tangled up in corner cases like this
too often, for example legacy event priorities and topdown event
sorting. Keeping things simple would be nice from a correctness pov
but also so that it's easy for other tools to emulate.

> The problem is that arm_cmn_[n] PMUs have a numeric suffix, but they can
> have different events. Even if we were adding this PMU today, keeping
> the suffix rule in mind, it would be difficult to come up with a suffix
> that differentiates the different ones. Flavour words might work, but
> that complicates the kernel which would have to group them and come up
> with flavours rather than just doing an i++. Deduplicating too
> aggressively on only PMU name suffix means only arm_cmn_1's events get
> listed, missing other events, and it's hard to see which events relate
> to which PMU.
>
> Therefore in addition to the changes in this patchset I'd like to look
> into:
>
>  * Collapsing duplicate PMU names into ranges, for example
>    arm_pmu_v3_[0-4], rather than simply concatenating names as done in
>    this patchset

The current suffix rule is weird as Intel's GPU i915 PMU appears to be
a PMU called 'i' with a 915 numeric suffix.
I think capturing the rules in the ABI doc and some kind of legacy
compatibility is do-able.
I like regular expressions over globs/fnmatch, for example, we could
use flex to turn the CPUID matches in mapfile.csv into something that
is parsed and matched with less interpretation/compilation at runtime.
I suspect we can bike-shed for a long time on what these new rules
will be, which has been why I've generally just tried to match
existing and inelegant behaviors.

>  * Deduplicate uncore based on the contents of events/ rather than just
>    the suffix
>
> As some background, the original commit for deduplication, commit
> 3241d46f5f54 ("perf pmus: Sort/merge/aggregate PMUs like mrvl_ddr_pmu")
> mentions reducing the number of duplicate PMUs, and is presumably
> motivated by usability. But there are also other commits mentioning
> reducing openat()s, for example lazily loading formats 504026412162
> ("perf pmu: Make the loading of formats lazy"). Deduplicating based on
> the contents of the events/ folder is somewhat in contention with this
> reduction, but could be done along side some more lazy loading (like of
> the terms) and hashing the result of readdir() without opening any of
> the contents. JSON tables can have event name hashes calculated at build
> time if we want to consider them for deduplication too.

I worry about the run time cost of this when there are 100s of uncore
PMUs. I wonder if the case of an empty events directory would also be
common.
Wrt the x86 conventions I think Kan is best placed to say what the
preferred convention should be.
The json will routinely say this event is on a PMU name without a
suffix, and the number of PMUs will only be known at runtime. Hashing
without the suffix would be fine but I think the controversy is over
what should be considered a suffix.

Thanks for doing the series, I'll dig into/test each one in turn.
Ian

> Then with the events hash, PMU's can be sorted based on this and the
> 'Unit:' string can be constructed with a set of values that collapses
> adjacent suffixes to display as ranges. I believe that could remove the
> need for any further changes to duplication based on suffix, but still
> avoids over deduplication.
>
> [1]: https://lore.kernel.org/linux-perf-users/CAP-5=fW_Sq4iFxoWPWuixz9fMLBPyPUO0RG0KPbYa-5T0DZbTA@mail.gmail.com/
>
> ---
> James Clark (3):
>       perf list: Order events by event name before PMU name
>       perf list: Collapse similar events across PMUs
>       perf list: Don't deduplicate core PMUs when listing events
>
>  tools/perf/builtin-list.c      |  2 +
>  tools/perf/util/pmu.c          |  5 ++-
>  tools/perf/util/pmu.h          |  2 +
>  tools/perf/util/pmus.c         | 95 ++++++++++++++++++++++++++++++++++--------
>  tools/perf/util/print-events.h |  1 +
>  5 files changed, 86 insertions(+), 19 deletions(-)
> ---
> base-commit: 7788ad59d1d9617792037a83513be5b1dd14150f
> change-id: 20250304-james-perf-hybrid-list-11b888cf435a
>
> Best regards,
> --
> James Clark <james.clark@...aro.org>
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ