Message-ID: <20250719030517.1990983-1-irogers@google.com>
Date: Fri, 18 Jul 2025 20:05:02 -0700
From: Ian Rogers <irogers@...gle.com>
To: Thomas Falcon <thomas.falcon@...el.com>, Peter Zijlstra <peterz@...radead.org>, 
	Ingo Molnar <mingo@...hat.com>, Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>, 
	Mark Rutland <mark.rutland@....com>, 
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, 
	Ian Rogers <irogers@...gle.com>, Adrian Hunter <adrian.hunter@...el.com>, 
	"Liang, Kan" <kan.liang@...ux.intel.com>, Ravi Bangoria <ravi.bangoria@....com>, 
	James Clark <james.clark@...aro.org>, Dapeng Mi <dapeng1.mi@...ux.intel.com>, 
	Weilin Wang <weilin.wang@...el.com>, Andi Kleen <ak@...ux.intel.com>, linux-kernel@...r.kernel.org, 
	linux-perf-users@...r.kernel.org
Subject: [PATCH v3 00/15] Fixes for Intel TMA, particularly for hybrid

On hybrid systems some PMUs apply to all core types; for metrics, the
main example is the msr PMU and its tsc event. A metric often only
wants the counter values for its own core type. These patches allow
the cpu term in an event to name a PMU whose cpumask is then used.
For example:

  $ perf stat -e msr/tsc,cpu=cpu_atom/ ...

will aggregate the msr/tsc/ value but only for atom cores. In doing
this, problems were identified in how cpumasks are handled by parsing
and event setup when cpumasks are specified along with a task to
profile. The event parsing, cpumask evlist propagation code and perf
stat code are updated accordingly.
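
As a rough illustration of the idea (not the code in this series), the
sketch below resolves a cpu term value to a set of CPUs: a PMU name
selects that PMU's cpumask, anything else is read as a single CPU or a
CPU range. The PMU_CPUS table and the resolve_cpu_term helper are
hypothetical; real cpumasks come from the running system.

```python
from typing import Dict, Set

# Hypothetical PMU name -> CPU set table, for illustration only.
# On a real machine the cpumask of each PMU is provided by the kernel.
PMU_CPUS: Dict[str, Set[int]] = {
    "cpu_core": set(range(0, 16)),
    "cpu_atom": set(range(16, 24)),
}


def resolve_cpu_term(value: str, pmu_cpus: Dict[str, Set[int]]) -> Set[int]:
    """Resolve a cpu term value to a set of CPU numbers.

    A value that names a known PMU yields that PMU's cpumask.
    Otherwise the value is parsed as "N" or "N-M" (inclusive).
    """
    if value in pmu_cpus:
        return set(pmu_cpus[value])
    low, sep, high = value.partition("-")
    first = int(low)  # raises ValueError for unknown names
    last = int(high) if sep else first
    if last < first:
        raise ValueError(f"invalid CPU range: {value}")
    return set(range(first, last + 1))


atom_cpus = resolve_cpu_term("cpu_atom", PMU_CPUS)
range_cpus = resolve_cpu_term("2-4", PMU_CPUS)
single_cpu = resolve_cpu_term("7", PMU_CPUS)
```

With this table, cpu=cpu_atom would restrict msr/tsc/ to the CPUs
listed for cpu_atom, while cpu=2-4 would select CPUs 2, 3 and 4.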

The final result of the patch series is to be able to run:
```
$ perf stat --no-scale -e 'msr/tsc/,msr/tsc,cpu=cpu_core/,msr/tsc,cpu=cpu_atom/' perf test -F 10
 10.1: Basic parsing test                                            : Ok
 10.2: Parsing without PMU name                                      : Ok
 10.3: Parsing with PMU name                                         : Ok

 Performance counter stats for 'perf test -F 10':

        63,704,975      msr/tsc/
        47,060,704      msr/tsc,cpu=cpu_core/                        (4.62%)
        16,640,591      msr/tsc,cpu=cpu_atom/                        (2.18%)
```

This has also exposed a kernel bug for task events: the enabled time
is reported as too large, leading to invalid scaling (hence the
--no-scale in the command line above).
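
To see why an oversized enabled time matters, recall that a scaled
count is the raw count multiplied by enabled time over running time.
The sketch below applies that formula to the cpu_core count from the
output above. The time values are made up for illustration and are not
measurements from this run.

```python
def scale_count(raw: int, enabled_ns: int, running_ns: int) -> float:
    """Extrapolate a raw count over the whole enabled window."""
    if running_ns == 0:
        return 0.0  # the event never ran, nothing to extrapolate
    return raw * enabled_ns / running_ns


raw = 47_060_704  # msr/tsc,cpu=cpu_core/ from the output above

# Hypothetical: the event ran for the whole time it was enabled.
expected = scale_count(raw, enabled_ns=1_000_000, running_ns=1_000_000)

# Hypothetical: the enabled time is over-reported by a factor of 20.
inflated = scale_count(raw, enabled_ns=20_000_000, running_ns=1_000_000)
```

In the second case the scaled value is 20 times the raw count even
though nothing was actually missed, which is the kind of invalid
scaling that --no-scale sidesteps.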

Additionally the series corrects topdown event processing and starts
injecting slots events as preparation for TMA 5.1 whose updates will
be sent as a follow-up patch series.

v3: Fix CPU map computation for uncore/requires_cpu, don't simplify
    for the "any"(-1) CPU case. Combine with the topdown slots fix and
    injection series previously posted at:
    https://lore.kernel.org/lkml/20250718132750.1546457-1-irogers@google.com/
    This has a grouping fix added for the injected slots event. Add a
    new no-grouping constraint for threshold + NMI to address
    alderlake metrics failing to schedule when thresholds are enabled
    along with the NMI watchdog.

v2: Add additional documentation of the cpu term to `perf list`
    (Namhyung), extend the term to also allow CPU ranges. Add Thomas
    Falcon's reviewed-by. Still open for discussion whether the term
    cpu should have >1 variant for PMUs, etc. or whether the single
    term is okay. We could refactor later and add a term; that would
    break existing users, but they are most likely to be metrics, so
    it is probably not a huge issue.

Ian Rogers (15):
  perf parse-events: Warn if a cpu term is unsupported by a CPU
  perf stat: Avoid buffer overflow to the aggregation map
  perf stat: Don't size aggregation ids from user_requested_cpus
  perf parse-events: Allow the cpu term to be a PMU or CPU range
  perf tool_pmu: Allow num_cpus(_online) to be specific to a cpumask
  libperf evsel: Rename own_cpus to pmu_cpus
  libperf evsel: Factor perf_evsel__exit out of perf_evsel__delete
  perf evsel: Use libperf perf_evsel__exit
  perf pmus: Factor perf_pmus__find_by_attr out of evsel__find_pmu
  perf parse-events: Minor __add_event refactoring
  perf evsel: Add evsel__open_per_cpu_and_thread
  perf parse-events: Support user CPUs mixed with threads/processes
  perf topdown: Use attribute to see an event is a topdown metric or
    slots
  perf parse-events: Fix missing slots for Intel topdown metric events
  perf metricgroups: Add NO_THRESHOLD_AND_NMI constraint

 tools/lib/perf/evlist.c                  | 119 +++++++++++++++-------
 tools/lib/perf/evsel.c                   |   9 +-
 tools/lib/perf/include/internal/evsel.h  |   3 +-
 tools/perf/Documentation/perf-list.txt   |  25 +++--
 tools/perf/arch/x86/include/arch-tests.h |   4 +
 tools/perf/arch/x86/tests/Build          |   1 +
 tools/perf/arch/x86/tests/arch-tests.c   |   1 +
 tools/perf/arch/x86/tests/topdown.c      |  76 ++++++++++++++
 tools/perf/arch/x86/util/evlist.c        |  24 +++++
 tools/perf/arch/x86/util/evsel.c         |  46 +++------
 tools/perf/arch/x86/util/topdown.c       |  59 +++++++----
 tools/perf/arch/x86/util/topdown.h       |   6 ++
 tools/perf/builtin-stat.c                |   9 +-
 tools/perf/pmu-events/jevents.py         |   1 +
 tools/perf/pmu-events/pmu-events.h       |  14 ++-
 tools/perf/tests/event_update.c          |   4 +-
 tools/perf/tests/parse-events.c          |  24 ++---
 tools/perf/util/evlist.c                 |  15 +--
 tools/perf/util/evlist.h                 |   1 +
 tools/perf/util/evsel.c                  |  55 ++++++++--
 tools/perf/util/evsel.h                  |   5 +
 tools/perf/util/expr.c                   |   2 +-
 tools/perf/util/header.c                 |   4 +-
 tools/perf/util/metricgroup.c            |  16 ++-
 tools/perf/util/parse-events.c           | 122 ++++++++++++++++++-----
 tools/perf/util/pmus.c                   |  29 +++---
 tools/perf/util/pmus.h                   |   2 +
 tools/perf/util/stat.c                   |   6 +-
 tools/perf/util/synthetic-events.c       |   4 +-
 tools/perf/util/tool_pmu.c               |  56 +++++++++--
 tools/perf/util/tool_pmu.h               |   2 +-
 31 files changed, 532 insertions(+), 212 deletions(-)
 create mode 100644 tools/perf/arch/x86/tests/topdown.c

-- 
2.50.0.727.gbf7dc18ff4-goog

