[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAP-5=fXnM15Zj5mCYcsd9usUMtHoPOd3Wz8-L1N5UaB-YvKuHQ@mail.gmail.com>
Date: Wed, 17 Sep 2025 08:29:24 -0700
From: Ian Rogers <irogers@...gle.com>
To: Arnaldo Carvalho de Melo <acme@...nel.org>
Cc: Gautam Menghani <gautam@...ux.ibm.com>, peterz@...radead.org, mingo@...hat.com,
namhyung@...nel.org, mark.rutland@....com, alexander.shishkin@...ux.intel.com,
jolsa@...nel.org, adrian.hunter@...el.com, kan.liang@...ux.intel.com,
maddy@...ux.ibm.com, linux-perf-users@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/2] perf python: Add an example for sampling
On Wed, Sep 17, 2025 at 5:37 AM Arnaldo Carvalho de Melo
<acme@...nel.org> wrote:
>
> On Tue, Sep 16, 2025 at 01:07:43PM -0700, Ian Rogers wrote:
> > On Tue, Sep 16, 2025 at 12:25 PM Arnaldo Carvalho de Melo
> > <acme@...nel.org> wrote:
> > >
> > > On Tue, Sep 16, 2025 at 07:00:48PM +0530, Gautam Menghani wrote:
> > > > Hi Ian/Arnaldo,
> > > >
> > > > Can you please review this series and let me know if any changes are
> > > > needed?
> > >
> > > Looking at it now, sry for the delay,
> >
> > I think the patches look good. I'm a little concerned that the python
> > APIs are a chance to do something better than the C APIs that have
> > evolved. For example, we removed UID out of target recently [1] as the
> > BPF alternative was better. Had this patch come earlier then it seems
> > likely we'd have had target with UIDs. I wonder rather than having a
> > kwlist of:
> >
> > + static char *kwlist[] = { "target", "inherit_stat", "no_buffering",
> > "no_inherit",
> > + "no_inherit_set", "no_samples", "raw_samples",
> > + "sample_address", "sample_phys_addr", "sample_data_page_size",
> > + "sample_code_page_size", "sample_weight", "sample_time",
> > + "sample_time_set", "sample_cpu", "sample_identifier",
> > + "sample_data_src", "period", "period_set", "running_time",
> > + "full_auxtrace", "auxtrace_snapshot_mode",
> > + "auxtrace_snapshot_on_exit", "auxtrace_sample_mode",
> > + "record_namespaces", "record_cgroup", "record_switch_events",
> > + "record_switch_events_set", "all_kernel", "all_user",
> > + "kernel_callchains", "user_callchains", "tail_synthesize",
> > + "overwrite", "ignore_missing_thread", "strict_freq", "sample_id",
> > + "no_bpf_event", "kcore", "text_poke", "build_id", "freq",
> > + "mmap_pages", "auxtrace_mmap_pages", "user_freq", "branch_stack",
> > + "sample_intr_regs", "sample_user_regs", "default_interval",
> > + "user_interval", "auxtrace_snapshot_size", "auxtrace_snapshot_opts",
> > + "auxtrace_sample_opts", "sample_transaction", "use_clockid",
> > + "clockid", "clockid_res_ns", "nr_cblocks", "affinity", "mmap_flush",
> > + "comp_level", "nr_threads_synthesize", "ctl_fd", "ctl_fd_ack",
> > + "ctl_fd_close", "synth", "threads_spec", "threads_user_spec",
> > + "off_cpu_thresh_ns", NULL };
> >
> > but then just using this subset:
> >
> > + opts = perf.record_opts(freq=1000, target=tgt, sample_time=True,
> > + sample_cpu=True, no_buffering=True,
> > no_inherit=True)
> >
> > The kwlist should be kept to just those necessary values for the
> > example to work? I kind of see this as Arnaldo's baby, so he may just
> > want everything, so this needn't be a blocker.
> >
> > Bigger picture I wonder about migrating the `perf script` code to just
> > being regular python programs like the example here.
>
> You mean:
>
> acme@...ber:~/git/perf-tools-next$ ls -la tools/perf/scripts/python/
> total 452
> drwxr-xr-x. 1 acme acme 902 Aug 20 14:18 .
> drwxr-xr-x. 1 acme acme 30 Sep 17 09:27 ..
> -rwxr-xr-x. 1 acme acme 11865 Aug 20 14:06 arm-cs-trace-disasm.py
> drwxr-xr-x. 1 acme acme 1640 Aug 20 14:18 bin
> -rw-r--r--. 1 acme acme 2461 Apr 16 10:06 check-perf-trace.py
> -rw-r--r--. 1 acme acme 7923 Apr 16 10:06 compaction-times.py
> -rw-r--r--. 1 acme acme 7497 Apr 16 10:06 event_analyzing_sample.py
> -rwxr-xr-x. 1 acme acme 157369 Aug 20 14:18 exported-sql-viewer.py
> -rw-r--r--. 1 acme acme 39845 Apr 16 10:06 export-to-postgresql.py
> -rw-r--r--. 1 acme acme 24671 Apr 16 10:06 export-to-sqlite.py
> -rw-r--r--. 1 acme acme 2173 Apr 16 10:06 failed-syscalls-by-pid.py
> -rwxr-xr-x. 1 acme acme 10377 Aug 20 14:18 flamegraph.py
> -rw-r--r--. 1 acme acme 1717 Apr 16 10:06 futex-contention.py
> -rw-r--r--. 1 acme acme 13302 Apr 16 10:06 gecko.py
> -rw-r--r--. 1 acme acme 14636 Apr 16 10:06 intel-pt-events.py
> -rw-r--r--. 1 acme acme 3395 Apr 16 10:06 libxed.py
> -rw-r--r--. 1 acme acme 4230 Aug 20 14:13 mem-phys-addr.py
> -rw-r--r--. 1 acme acme 15420 Aug 20 14:05 netdev-times.py
> -rwxr-xr-x. 1 acme acme 1833 Apr 16 10:06 net_dropmonitor.py
> -rwxr-xr-x. 1 acme acme 30683 Aug 20 14:05 parallel-perf.py
> drwxr-xr-x. 1 acme acme 34 Sep 17 09:27 Perf-Trace-Util
> -rw-r--r--. 1 acme acme 4509 Apr 16 10:06 powerpc-hcalls.py
> -rw-r--r--. 1 acme acme 12095 Apr 16 10:06 sched-migration.py
> -rw-r--r--. 1 acme acme 2183 Apr 16 10:06 sctop.py
> -rwxr-xr-x. 1 acme acme 4408 Apr 16 10:06 stackcollapse.py
> -rw-r--r--. 1 acme acme 2444 Apr 16 10:06 stat-cpi.py
> -rw-r--r--. 1 acme acme 2055 Apr 16 10:06 syscall-counts-by-pid.py
> -rw-r--r--. 1 acme acme 1673 Apr 16 10:06 syscall-counts.py
> -rwxr-xr-x. 1 acme acme 34014 Apr 16 10:06 task-analyzer.py
> acme@...ber:~/git/perf-tools-next$
>
> And then make:
>
> acme@...ber:~/git/perf-tools-next$ perf script -l
> List of available trace scripts:
> compaction-times [-h] [-u] [-p|-pv] [-t | [-m] [-fs] [-ms]] [pid|pid-range|comm-regex] display time taken by mm compaction
> event_analyzing_sample analyze all perf samples
> export-to-postgresql [database name] [columns] [calls] export perf data to a postgresql database
> export-to-sqlite [database name] [columns] [calls] export perf data to a sqlite3 database
> failed-syscalls-by-pid [comm] system-wide failed syscalls, by pid
> flamegraph create flame graphs
> futex-contention futext contention measurement
> gecko create firefox gecko profile json format from perf.data
> intel-pt-events print Intel PT Events including Power Events and PTWRITE
> mem-phys-addr resolve physical address samples
> net_dropmonitor display a table of dropped frames
> netdev-times [tx] [rx] [dev=] [debug] display a process of packet and processing time
> powerpc-hcalls
> sched-migration sched migration overview
> sctop [comm] [interval] syscall top
> stackcollapse produce callgraphs in short form for scripting use
> syscall-counts-by-pid [comm] system-wide syscall counts, by pid
> syscall-counts [comm] system-wide syscall counts
> task-analyzer analyze timings of tasks
> failed-syscalls [comm] system-wide failed syscalls
> rw-by-file <comm> r/w activity for a program, by file
> rw-by-pid system-wide r/w activity
> rwtop [interval] system-wide r/w top
> wakeup-latency system-wide min/max/avg wakeup latency
> acme@...ber:~/git/perf-tools-next$
>
> And make:
>
> perf script rwtop
>
> Just call 'python PATH_TO_PYTHON_SCRIPTS/rwtop.py' transparently?
Yeah, that's it. The perf script libpython stuff is just providing
trace_begin, process_event and trace_end. Using the sampling
mechanism, as shown in Gautum's patches, it is pretty easy to migrate
them to being stand alone bits of python.
> That looks interesting indeed, that way we would stop linking with
> libpython, etc.
>
> I wonder if there are out of tree scripts using the current tools/perf/util/scripting-engines/trace-event-python.c
> mechanism...
>
> But even that can fallback to a python based mechanism, right?
I think so. Like I said about the use of a Dict for process_event, we
may want to streamline things so there is a tension with what the API
should be and compatibility. We can always have 2 APIs and try to
deprecate one of them.
> Import the script, if it has a given structure, use the new way, if not,
> call a glue that reads the events and feed to the old style code.
>
> Seems doable and would save code on the main perf binary and headaches
> with the libpython and libperl build processes.
So I see this for libpython, and I think it'd be pretty cool if we
could have things work like this for say "perf script ilist" and
Alice's textual flamegraph work. I worry what the work for libperl
would be like and whether it is worth it (hence sending the patch to
at least start to make it opt-in rather than opt-out).
Do you need my tags for these changes or wdyt about making the
kwlist/API surface smaller?
Thanks!
Ian
> - Arnaldo
>
> > I sent out
> > deprecating the libperl code to this ends (looking for reviews):
> > https://lore.kernel.org/linux-perf-users/20250908181918.3533480-1-irogers@google.com/
> > The issue is that `perf script` being the main thread inhibits things
> > like textual running until trace_end. This means we can't do things
> > like incremental loading support. We may want to make the perf events
> > support something like an asyncio interface for that.
> >
> > Refactoring that support will likely raise backward compatibility
> > concerns. It'd be a really nice thing to do as the API has some fairly
> > major overheads like turning everything in a sample into a Dict
> > whether needed or not:
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/scripting-engines/trace-event-python.c#n838
> > I mention this just to say why I'd like to minimize the API when possible.
> >
> > Thanks,
> > Ian
> >
> > [1] https://lore.kernel.org/r/20250604174545.2853620-10-irogers@google.com
Powered by blists - more mailing lists