lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20251029053413.355154-1-irogers@google.com>
Date: Tue, 28 Oct 2025 22:33:58 -0700
From: Ian Rogers <irogers@...gle.com>
To: Suzuki K Poulose <suzuki.poulose@....com>, Mike Leach <mike.leach@...aro.org>, 
	James Clark <james.clark@...aro.org>, John Garry <john.g.garry@...cle.com>, 
	Will Deacon <will@...nel.org>, Leo Yan <leo.yan@...ux.dev>, 
	Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, 
	Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>, 
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, 
	Ian Rogers <irogers@...gle.com>, Adrian Hunter <adrian.hunter@...el.com>, 
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>, Charlie Jenkins <charlie@...osinc.com>, 
	Thomas Falcon <thomas.falcon@...el.com>, Yicong Yang <yangyicong@...ilicon.com>, 
	Thomas Richter <tmricht@...ux.ibm.com>, Athira Rajeev <atrajeev@...ux.ibm.com>, 
	Howard Chu <howardchu95@...il.com>, Song Liu <song@...nel.org>, 
	Dapeng Mi <dapeng1.mi@...ux.intel.com>, Levi Yun <yeoreum.yun@....com>, 
	Zhongqiu Han <quic_zhonhan@...cinc.com>, Blake Jones <blakejones@...gle.com>, 
	Anubhav Shelat <ashelat@...hat.com>, Chun-Tse Shao <ctshao@...gle.com>, 
	Christophe Leroy <christophe.leroy@...roup.eu>, 
	Jean-Philippe Romain <jean-philippe.romain@...s.st.com>, Gautam Menghani <gautam@...ux.ibm.com>, 
	Dmitry Vyukov <dvyukov@...gle.com>, Yang Li <yang.lee@...ux.alibaba.com>, 
	linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org, 
	Andi Kleen <ak@...ux.intel.com>, Weilin Wang <weilin.wang@...el.com>
Subject: [RFC PATCH v1 00/15] Addition of session API to python module

The perf script command uses a session with process_events to call
through to the python process_events function. The event is turned
into a python dictionary, whether the entries are used or not, adding
overhead. To avoid the overhead, add a session API abstraction and
pass callbacks that can be used to perform the existing perf script
functions. The implementation is incomplete in this RFC.

In this series the mem-phys-addr.py command is ported from perf script
to using the session API. The performance before and after is:

Before:
```
$ perf mem record -a sleep 1
$ time perf script tools/perf/scripts/python/mem-phys-addr.py
Event: cpu_core/mem-loads-aux/
Memory type                                    count  percentage
 ---------------------------------------  ----------  ----------
0-fff : Reserved                                3217       100.0

real    0m3.754s
user    0m0.023s
sys     0m0.018s
```

After:
```
$ PYTHONPATH=/tmp/perf/python time python3 tools/perf/python/mem-phys-addr.py
Event: evsel(cpu_core/mem-loads-aux/)
Memory type                                    count  percentage
 ---------------------------------------  ----------  ----------
0-fff : Reserved                                3217       100.0

real    0m0.106s
user    0m0.021s
sys     0m0.020s
```

So a roughly 35x speedup, but it maybe that some of that is one time
start-up overhead of libpython which wouldn't be present for larger
perf.data files.

Before porting all the script commands and adding things like
callchain support to the python module, I wanted to get feedback. One
thing that particularly simplifies the series is adding reference
counts to evsel and evlist to avoid copying/cloning evsels created by
the session API when loading a perf.data file.

The approach of moving away from libpython and scripts was most
recently discussed as a topic in:
https://lore.kernel.org/lkml/CAP-5=fWDqE8SYfOLZkg_0=4Ayx6E7O+h7uUp4NDeCFkiN4b7-w@mail.gmail.com/

When creating the python wrapper some house keeping was done around
includes and perf_data's encapsulation.

The perf script callbacks differ from those in perf_tool, for example,
stat is the perf_tool callback is for a stat event while the scripting
ops combine things and have a stat callback associated with
stat_round. Should the session API match the tool or the script API?
The former feels better for long term, while the latter could simplify
porting perf scripts.

Ian Rogers (15):
  perf arch arm: Sort includes and add missed explicit dependencies
  perf arch x86: Sort includes and add missed explicit dependencies
  perf tests: Sort includes and add missed explicit dependencies
  perf script: Sort includes and add missed explicit dependencies
  perf util: Sort includes and add missed explicit dependencies
  perf python: Add add missed explicit dependencies
  perf evsel/evlist: Avoid unnecessary #includes
  perf maps: Move getting debug_file to verbose path
  perf data: Clean up use_stdio and structures
  perf python: Add wrapper for perf_data file abstraction
  perf python: Add python session abstraction wrapping perf's session
  perf evlist: Add reference count
  perf evsel: Add reference count
  perf python: Add access to evsel and phys_addr in event
  perf mem-phys-addr.py: Port to standalone application from perf script

 tools/perf/arch/arm/util/cs-etm.c           |  22 +-
 tools/perf/arch/x86/tests/hybrid.c          |   2 +-
 tools/perf/arch/x86/tests/topdown.c         |   2 +-
 tools/perf/arch/x86/util/intel-bts.c        |  14 +-
 tools/perf/arch/x86/util/intel-pt.c         |  31 +-
 tools/perf/arch/x86/util/iostat.c           |   2 +-
 tools/perf/bench/evlist-open-close.c        |  18 +-
 tools/perf/builtin-ftrace.c                 |   8 +-
 tools/perf/builtin-inject.c                 |   7 +-
 tools/perf/builtin-kvm.c                    |   4 +-
 tools/perf/builtin-lock.c                   |   2 +-
 tools/perf/builtin-record.c                 |  14 +-
 tools/perf/builtin-script.c                 | 109 ++--
 tools/perf/builtin-stat.c                   |   8 +-
 tools/perf/builtin-top.c                    |  52 +-
 tools/perf/builtin-trace.c                  |  38 +-
 tools/perf/python/mem-phys-addr.py          | 117 ++++
 tools/perf/tests/backward-ring-buffer.c     |  18 +-
 tools/perf/tests/code-reading.c             |   4 +-
 tools/perf/tests/event-times.c              |   4 +-
 tools/perf/tests/event_update.c             |   2 +-
 tools/perf/tests/evsel-roundtrip-name.c     |   8 +-
 tools/perf/tests/evsel-tp-sched.c           |   4 +-
 tools/perf/tests/expand-cgroup.c            |   8 +-
 tools/perf/tests/hists_cumulate.c           |   2 +-
 tools/perf/tests/hists_filter.c             |   2 +-
 tools/perf/tests/hists_link.c               |   2 +-
 tools/perf/tests/hists_output.c             |   2 +-
 tools/perf/tests/hwmon_pmu.c                |  14 +-
 tools/perf/tests/keep-tracking.c            |   2 +-
 tools/perf/tests/mmap-basic.c               |  31 +-
 tools/perf/tests/openat-syscall-all-cpus.c  |   6 +-
 tools/perf/tests/openat-syscall-tp-fields.c |  18 +-
 tools/perf/tests/openat-syscall.c           |   6 +-
 tools/perf/tests/parse-events.c             |   4 +-
 tools/perf/tests/parse-metric.c             |   4 +-
 tools/perf/tests/parse-no-sample-id-all.c   |   2 +-
 tools/perf/tests/perf-record.c              |  18 +-
 tools/perf/tests/perf-time-to-tsc.c         |   2 +-
 tools/perf/tests/pfm.c                      |   4 +-
 tools/perf/tests/pmu-events.c               |   6 +-
 tools/perf/tests/pmu.c                      |   2 +-
 tools/perf/tests/sw-clock.c                 |  14 +-
 tools/perf/tests/switch-tracking.c          |   2 +-
 tools/perf/tests/task-exit.c                |  14 +-
 tools/perf/tests/tool_pmu.c                 |   2 +-
 tools/perf/tests/topology.c                 |   5 +-
 tools/perf/util/bpf_counter_cgroup.c        |   2 +-
 tools/perf/util/bpf_off_cpu.c               |  28 +-
 tools/perf/util/bpf_trace_augment.c         |   7 +-
 tools/perf/util/cgroup.c                    |   6 +-
 tools/perf/util/data-convert-bt.c           |   2 +-
 tools/perf/util/data.c                      |  81 ++-
 tools/perf/util/data.h                      |  52 +-
 tools/perf/util/evlist.c                    | 100 ++--
 tools/perf/util/evlist.h                    |  23 +-
 tools/perf/util/evsel.c                     | 103 ++--
 tools/perf/util/evsel.h                     |  30 +-
 tools/perf/util/expr.c                      |   2 +-
 tools/perf/util/header.c                    |  12 +-
 tools/perf/util/map.h                       |   6 +-
 tools/perf/util/maps.c                      |   9 +-
 tools/perf/util/metricgroup.c               |   6 +-
 tools/perf/util/parse-events.c              |   4 +-
 tools/perf/util/parse-events.y              |   2 +-
 tools/perf/util/perf_api_probe.c            |  19 +-
 tools/perf/util/pfm.c                       |   2 +-
 tools/perf/util/print-events.c              |   2 +-
 tools/perf/util/print_insn.h                |   5 +-
 tools/perf/util/python.c                    | 584 +++++++++++++++-----
 tools/perf/util/record.c                    |   2 +-
 tools/perf/util/s390-sample-raw.c           |  15 +-
 tools/perf/util/session.c                   |   4 +-
 tools/perf/util/sideband_evlist.c           |  16 +-
 tools/perf/util/stat-shadow.c               |   1 +
 tools/perf/util/stat.c                      |  15 +-
 76 files changed, 1152 insertions(+), 650 deletions(-)
 create mode 100644 tools/perf/python/mem-phys-addr.py

-- 
2.51.1.851.g4ebd6896fd-goog


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ