Message-ID: <CAP-5=fXmzETmP3Ra3z_nM-Go2X6SRZ3im+bA0CLvdXS=VtJWOw@mail.gmail.com>
Date: Mon, 21 Jul 2025 10:44:09 -0700
From: Ian Rogers <irogers@...gle.com>
To: James Clark <james.clark@...aro.org>
Cc: Thomas Falcon <thomas.falcon@...el.com>, Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>, Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>,
Adrian Hunter <adrian.hunter@...el.com>, Kan Liang <kan.liang@...ux.intel.com>,
Ben Gainey <ben.gainey@....com>, Howard Chu <howardchu95@...il.com>,
Weilin Wang <weilin.wang@...el.com>, Levi Yun <yeoreum.yun@....com>,
"Dr. David Alan Gilbert" <linux@...blig.org>, Zhongqiu Han <quic_zhonhan@...cinc.com>,
Blake Jones <blakejones@...gle.com>, Yicong Yang <yangyicong@...ilicon.com>,
Anubhav Shelat <ashelat@...hat.com>, Thomas Richter <tmricht@...ux.ibm.com>,
Jean-Philippe Romain <jean-philippe.romain@...s.st.com>, Song Liu <song@...nel.org>,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 00/12] CPU mask improvements/fixes particularly for hybrid
On Mon, Jul 21, 2025 at 9:13 AM James Clark <james.clark@...aro.org> wrote:
>
>
>
> On 27/06/2025 8:24 pm, Ian Rogers wrote:
> > On hybrid systems some PMUs apply to all core types, particularly for
> > metrics the msr PMU and the tsc event. The metrics often only want the
> > values of the counter for their specific core type. These patches
> > allow the cpu term in an event to give a PMU name to take the cpumask
> > from. For example:
> >
> > $ perf stat -e msr/tsc,cpu=cpu_atom/ ...
> >
> > will aggregate the msr/tsc/ value but only for atom cores. In doing
> > this problems were identified in how cpumasks are handled by parsing
> > and event setup when cpumasks are specified along with a task to
> > profile. The event parsing, cpumask evlist propagation code and perf
> > stat code are updated accordingly.
> >
> > The final result of the patch series is to be able to run:
> > ```
> > $ perf stat --no-scale -e 'msr/tsc/,msr/tsc,cpu=cpu_core/,msr/tsc,cpu=cpu_atom/' perf test -F 10
> > 10.1: Basic parsing test : Ok
> > 10.2: Parsing without PMU name : Ok
> > 10.3: Parsing with PMU name : Ok
> >
> > Performance counter stats for 'perf test -F 10':
> >
> > 63,704,975 msr/tsc/
> > 47,060,704 msr/tsc,cpu=cpu_core/ (4.62%)
> > 16,640,591 msr/tsc,cpu=cpu_atom/ (2.18%)
> > ```
> >
> > This has (further) identified a kernel bug for task events around the
> > enabled time being too large leading to invalid scaling (hence the
> > --no-scale in the command line above).
> >
> > Ian Rogers (12):
> > perf parse-events: Warn if a cpu term is unsupported by a CPU
> > perf stat: Avoid buffer overflow to the aggregation map
> > perf stat: Don't size aggregation ids from user_requested_cpus
> > perf parse-events: Allow the cpu term to be a PMU
> > perf tool_pmu: Allow num_cpus(_online) to be specific to a cpumask
> > libperf evsel: Rename own_cpus to pmu_cpus
> > libperf evsel: Factor perf_evsel__exit out of perf_evsel__delete
> > perf evsel: Use libperf perf_evsel__exit
> > perf pmus: Factor perf_pmus__find_by_attr out of evsel__find_pmu
> > perf parse-events: Minor __add_event refactoring
> > perf evsel: Add evsel__open_per_cpu_and_thread
> > perf parse-events: Support user CPUs mixed with threads/processes
> >
> > tools/lib/perf/evlist.c | 118 ++++++++++++++++--------
> > tools/lib/perf/evsel.c | 9 +-
> > tools/lib/perf/include/internal/evsel.h | 3 +-
> > tools/perf/builtin-stat.c | 9 +-
> > tools/perf/tests/event_update.c | 4 +-
> > tools/perf/util/evlist.c | 15 +--
> > tools/perf/util/evsel.c | 55 +++++++++--
> > tools/perf/util/evsel.h | 5 +
> > tools/perf/util/expr.c | 2 +-
> > tools/perf/util/header.c | 4 +-
> > tools/perf/util/parse-events.c | 102 ++++++++++++++------
> > tools/perf/util/pmus.c | 29 +++---
> > tools/perf/util/pmus.h | 2 +
> > tools/perf/util/stat.c | 6 +-
> > tools/perf/util/synthetic-events.c | 4 +-
> > tools/perf/util/tool_pmu.c | 56 +++++++++--
> > tools/perf/util/tool_pmu.h | 2 +-
> > 17 files changed, 297 insertions(+), 128 deletions(-)
> >
>
> Tested-by: James Clark <james.clark@...aro.org>

Much appreciated, thanks James!

There's a v2 patch set but the Tested-by will be good for the majority
of patches that are unchanged in that:
https://lore.kernel.org/lkml/20250717210233.1143622-1-irogers@google.com/

I'm of course interested in getting RFC feedback on:
https://lore.kernel.org/lkml/20250716223924.825772-1-irogers@google.com/
which introduces an extra state to avoid gathering enabled time on
CPUs an event can't run on.

Thanks,
Ian