[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aBuAjt5PyBsN65R9@x1>
Date: Wed, 7 May 2025 12:47:26 -0300
From: Arnaldo Carvalho de Melo <acme@...nel.org>
To: Dmitry Vyukov <dvyukov@...gle.com>
Cc: Namhyung Kim <namhyung@...nel.org>, Ingo Molnar <mingo@...nel.org>,
Ian Rogers <irogers@...gle.com>,
Kan Liang <kan.liang@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>,
Adrian Hunter <adrian.hunter@...el.com>,
Peter Zijlstra <peterz@...radead.org>,
LKML <linux-kernel@...r.kernel.org>,
linux-perf-users@...r.kernel.org, Andi Kleen <ak@...ux.intel.com>
Subject: Re: [RFC/PATCH] perf report: Support latency profiling in
system-wide mode
On Wed, May 07, 2025 at 11:58:10AM +0200, Dmitry Vyukov wrote:
> On Tue, 6 May 2025 at 18:58, Arnaldo Carvalho de Melo <acme@...nel.org> wrote:
> > > So maybe when we notice that cycles was used 'perf report/top' should
> > > use the term 'wall-clock' for the column name?
> > > So instead of having a --latency we could have a hint that would tell
> > > the user that if this knob was set:
> > > $ perf config report.wall-clock=true
> I am not sure it can be as simple as a single global knob.
> First, record needs to collect additional info and that may be
> somewhat expensive.
> Second, report now has several modes:
> - it can show latency, but order by the current overhead
> - it can also show latency, and order by latency
> A user wants one or another depending on what they are optimizing
> (possibly looking at both).
> I also feel that global config switches are even less discoverable by
> median users (read -- ~nobody will know about that). If these things
> are normal flags with a help description, then some users may
> eventually discover them.
So, the addition of:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e9cbc854d8b148e3491291fb615e94261970fb54
perf config: Add a function to set one variable in .perfconfig
To allow for setting a variable from some other tool, like with the
"wallclock" patchset needs to allow the user to opt-in to having
that key in the sort order for 'perf report'.
Cc: Dmitriy Vyukov <dvyukov@...gle.com>
Was to tell the user about the new possibilities of profiling, once,
with something like:
root@...ber:~# perf report
┌─────────────────────────────────────────────────────┐
│Do you want to use latency mode? │
│That will add a 'Latency' column that would mean │
│'wall-clock' time when used with cycles, for instance│
│ │
│ │
│Enter: Yes, ESC: No │
└─────────────────────────────────────────────────────┘
The working should inform if the current perf.data has what is needed
and thus this would take immediate effect, and if not, inform that as
well and what is needed to be able to use that, things like you
described above:
> First, record needs to collect additional info and that may be
> somewhat expensive.
If the user says that the feature is desired by pressing Enter, follow
up questions can be asked as well, like you described:
> Second, report now has several modes:
> - it can show latency, but order by the current overhead
> - it can also show latency, and order by latency
> A user wants one or another depending on what they are optimizing
> (possibly looking at both).
Once these are answered, the questions goes away, as they are now
recorded in the user's ~/.perfconfig file and will be used when what is
needed is present in perf.data file.
The user should also be informed that to change that knob, 'perf config'
is your friend.
Additionally some hotkey could be made available to change that
behaviour, on the fly, with the option of turning the new mode the new
default, again writing to ~/.perfconfig.
Sometimes this "on the fly" needs reprocessing of the whole perf.data
file, sometimes it may be a matter of just showing extra available
fields, like the 'V' key now, that goes on bumping the verbosity in the
'perf report' and 'perf top' browsers (they share the hists browser), or
'j' that toogles showing the lines from jump sources to jump targets in
the annotate browser, 'J' to show how many jumps target a particular
jump target, also in the annotate browser, etc.
This could help experimenting with the various modes of doing profiling,
interactively, to see the problem from different angles, without having
to restart the whole process by exiting the 'perf report' to add new
command line options, then restart, etc, reusing computation sometimes
when switching views, etc.
To get default behaviour its just a matter of renaming ~/.perfconfig to
~/.perfconfig.some_suitabe_name_for_later_use.
This way interesting new features like this don't get buried behind
either command line options nor man pages, giving the user at least an
opportunity to be informed about it.
That is what the patch below barely starts sketching, not doing all I
just described.
- Arnaldo
> > > Then if 'cycles' is present we would have:
> > Something along the lines of the patch below, but there are several
> > details to take into account with what is in the current codebase, only
> > if what is needed for doing the latency/wall-clock time is present in
> > the perf.data being used is present that knob would be used or suggested
> > to the user, so some refactorings are needed, I'll try to folow on it.
> > But just for reference:
> > diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> > index f0299c7ee0254a37..20874800d967ffb5 100644
> > --- a/tools/perf/builtin-report.c
> > +++ b/tools/perf/builtin-report.c
> > @@ -51,6 +51,7 @@
> > #include "util/util.h" // perf_tip()
> > #include "ui/ui.h"
> > #include "ui/progress.h"
> > +#include "ui/util.h"
> > #include "util/block-info.h"
> >
> > #include <dlfcn.h>
> > @@ -127,6 +128,11 @@ static int report__config(const char *var, const char *value, void *cb)
> > {
> > struct report *rep = cb;
> >
> > + if (!strcmp(var, "report.prefer_latency")) {
> > + symbol_conf.prefer_latency = perf_config_bool(var, value);
> > + symbol_conf.prefer_latency_set = true;
> > + return 0;
> > + }
> > if (!strcmp(var, "report.group")) {
> > symbol_conf.event_group = perf_config_bool(var, value);
> > return 0;
> > @@ -1734,6 +1740,15 @@ int cmd_report(int argc, const char **argv)
> > symbol_conf.annotate_data_sample = true;
> > }
> >
> > + if (!symbol_conf.prefer_latency) {
> > + if (ui__dialog_yesno("Do you want to use latency mode?\n"
> > + "That will add a 'Latency' column that would mean\n"
> > + "'wall-clock' time when used with cycles, for instance\n")) {
> > + symbol_conf.prefer_latency = symbol_conf.prefer_latency_set = true;
> > + perf_config__set_variable("report.prefer_latency", "yes");
> > + }
> > + }
> > +
> > symbol_conf.enable_latency = true;
> > if (report.disable_order || !perf_session__has_switch_events(session)) {
> > if (symbol_conf.parallelism_list_str ||
> > diff --git a/tools/perf/util/symbol_conf.h b/tools/perf/util/symbol_conf.h
> > index cd9aa82c7d5ad941..f87071f5dedd94ca 100644
> > --- a/tools/perf/util/symbol_conf.h
> > +++ b/tools/perf/util/symbol_conf.h
> > @@ -51,6 +51,7 @@ struct symbol_conf {
> > annotate_data_sample,
> > skip_empty,
> > enable_latency,
> > + prefer_latency_set,
> > prefer_latency;
> > const char *vmlinux_name,
> > *kallsyms_name,
Powered by blists - more mailing lists