[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aRd_xgPPlk9PiKm_@google.com>
Date: Fri, 14 Nov 2025 11:15:18 -0800
From: Namhyung Kim <namhyung@...nel.org>
To: Ian Rogers <irogers@...gle.com>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>,
James Clark <james.clark@...aro.org>, Jiri Olsa <jolsa@...nel.org>,
Adrian Hunter <adrian.hunter@...el.com>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>, LKML <linux-kernel@...r.kernel.org>,
linux-perf-users@...r.kernel.org,
Steven Rostedt <rostedt@...dmis.org>,
Josh Poimboeuf <jpoimboe@...nel.org>,
Indu Bhagat <indu.bhagat@...cle.com>,
Jens Remus <jremus@...ux.ibm.com>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
linux-trace-kernel@...r.kernel.org, bpf@...r.kernel.org
Subject: Re: [PATCH v3 3/5] perf record: Enable defer_callchain for user
callchains
On Fri, Nov 14, 2025 at 10:12:34AM -0800, Ian Rogers wrote:
> On Fri, Nov 14, 2025 at 10:09 AM Ian Rogers <irogers@...gle.com> wrote:
> >
> > On Fri, Nov 14, 2025 at 9:59 AM Ian Rogers <irogers@...gle.com> wrote:
> > >
> > > On Thu, Nov 13, 2025 at 11:01 PM Namhyung Kim <namhyung@...nel.org> wrote:
> > > >
> > > > And add the missing feature detection logic to clear the flag on old
> > > > kernels.
> > > >
> > > > $ perf record -g -vv true
> > > > ...
> > > > ------------------------------------------------------------
> > > > perf_event_attr:
> > > > type 0 (PERF_TYPE_HARDWARE)
> > > > size 136
> > > > config 0 (PERF_COUNT_HW_CPU_CYCLES)
> > > > { sample_period, sample_freq } 4000
> > > > sample_type IP|TID|TIME|CALLCHAIN|PERIOD
> > > > read_format ID|LOST
> > > > disabled 1
> > > > inherit 1
> > > > mmap 1
> > > > comm 1
> > > > freq 1
> > > > enable_on_exec 1
> > > > task 1
> > > > sample_id_all 1
> > > > mmap2 1
> > > > comm_exec 1
> > > > ksymbol 1
> > > > bpf_event 1
> > > > defer_callchain 1
> > > > defer_output 1
> > > > ------------------------------------------------------------
> > > > sys_perf_event_open: pid 162755 cpu 0 group_fd -1 flags 0x8
> > > > sys_perf_event_open failed, error -22
> > > > switching off deferred callchain support
> > > >
> > > > Signed-off-by: Namhyung Kim <namhyung@...nel.org>
> > > > ---
> > > > tools/perf/util/evsel.c | 24 ++++++++++++++++++++++++
> > > > tools/perf/util/evsel.h | 1 +
> > > > 2 files changed, 25 insertions(+)
> > > >
> > > > diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> > > > index 244b3e44d090d413..f5652d00b457d096 100644
> > > > --- a/tools/perf/util/evsel.c
> > > > +++ b/tools/perf/util/evsel.c
> > > > @@ -1061,6 +1061,14 @@ static void __evsel__config_callchain(struct evsel *evsel, struct record_opts *o
> > > > }
> > > > }
> > > >
> > > > + if (param->record_mode == CALLCHAIN_FP && !attr->exclude_callchain_user) {
> > > > + /*
> > > > + * Enable deferred callchains optimistically. It'll be switched
> > > > + * off later if the kernel doesn't support it.
> > > > + */
> > > > + attr->defer_callchain = 1;
> > > > + }
> > >
> > > If a user has requested frame pointer call chains why would they want
> > > deferred call chains? The point of deferral to my understanding is to
> > > allow the paging in of debug data, but frame pointers don't need that
> > > as the stack should be in the page cache.
> > >
> > > Is this being done for code coverage reasons so that deferral is known
> > > to work for later addition of SFrames? In which case this should be an
> > > opt-in not default behavior. When there is a record_mode of
> > > CALLCHAIN_SFRAME then making deferral the default for that mode makes
> > > sense, but not for frame pointers IMO.
> >
> > Just to be clear. I don't think the behavior of using frame pointers
> > should change. Deferral has downsides, for example:
> >
> > $ perf record -g -a sleep 1
> >
> > Without deferral kernel stack traces will contain both kernel and user
> > traces. With deferral the user stack trace is only generated when the
> > system call returns and so there is a chance for kernel stack traces
> > to be missing their user part. An obvious behavioral change. I think
> > for what you are doing here we can have an option something like:
> >
> > $ perf record --call-graph fp-deferred -a sleep 1
> >
> > Which would need a man page update, etc. What is happening with the
> > other call-graph modes and deferral? Could the option be something
> > like `--call-graph fp,deferred` so that the option is a common one and
> > say stack snapshots for dwarf be somehow improved?
>
> Also, making deferral the norm will generate new perf events that
> tools, other than perf, processing perf.data files will fail to
> consume. So this change would break quite a lot of stuff, so it should
> not just be made the default.
Thanks a lot for your input! Yeah I agree it'd be better to make it
optional. Having separate `--call-graph fp,defer` sounds good. I can
add a config option to control deferred callchains as well.
Thanks,
Namhyung
Powered by blists - more mailing lists