Message-ID: <CAEf4BzbAt_3co0s-+DspnHuJryG2DKPLP9OwsN0bWWnbd5zsmQ@mail.gmail.com>
Date: Thu, 16 Oct 2025 13:39:20 -0700
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Jiri Olsa <olsajiri@...il.com>, Tao Chen <chen.dylane@...ux.dev>,
Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Ian Rogers <irogers@...gle.com>,
Adrian Hunter <adrian.hunter@...el.com>, Kan Liang <kan.liang@...ux.intel.com>,
Song Liu <song@...nel.org>, Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>, Martin KaFai Lau <martin.lau@...ux.dev>, Eduard <eddyz87@...il.com>,
Yonghong Song <yonghong.song@...ux.dev>, John Fastabend <john.fastabend@...il.com>,
KP Singh <kpsingh@...nel.org>, Stanislav Fomichev <sdf@...ichev.me>, Hao Luo <haoluo@...gle.com>,
"linux-perf-use." <linux-perf-users@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>,
bpf <bpf@...r.kernel.org>
Subject: Re: [RFC PATCH bpf-next v2 2/2] bpf: Pass external callchain entry to get_perf_callchain
On Tue, Oct 14, 2025 at 8:02 AM Alexei Starovoitov
<alexei.starovoitov@...il.com> wrote:
>
> On Tue, Oct 14, 2025 at 5:14 AM Jiri Olsa <olsajiri@...il.com> wrote:
> >
> > On Tue, Oct 14, 2025 at 06:01:28PM +0800, Tao Chen wrote:
> > > As Alexei noted, get_perf_callchain() return values may be reused
> > > if a task is preempted after the BPF program enters migrate disable
> > > mode. Drawing on the per-cpu design of bpf_perf_callchain_entries,
> > > stack-allocated memory of bpf_perf_callchain_entry is used here.
> > >
> > > Signed-off-by: Tao Chen <chen.dylane@...ux.dev>
> > > ---
> > > kernel/bpf/stackmap.c | 19 +++++++++++--------
> > > 1 file changed, 11 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
> > > index 94e46b7f340..acd72c021c0 100644
> > > --- a/kernel/bpf/stackmap.c
> > > +++ b/kernel/bpf/stackmap.c
> > > @@ -31,6 +31,11 @@ struct bpf_stack_map {
> > > struct stack_map_bucket *buckets[] __counted_by(n_buckets);
> > > };
> > >
> > > +struct bpf_perf_callchain_entry {
> > > + u64 nr;
> > > + u64 ip[PERF_MAX_STACK_DEPTH];
> > > +};
> > > +
we shouldn't introduce another type; there is already perf_callchain_entry
in linux/perf_event.h. What's the problem with using that?
> > > static inline bool stack_map_use_build_id(struct bpf_map *map)
> > > {
> > > return (map->map_flags & BPF_F_STACK_BUILD_ID);
> > > @@ -305,6 +310,7 @@ BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map,
> > > bool user = flags & BPF_F_USER_STACK;
> > > struct perf_callchain_entry *trace;
> > > bool kernel = !user;
> > > + struct bpf_perf_callchain_entry entry = { 0 };
> >
> > so IIUC having entries on stack we do not need to do preempt_disable
> > you had in the previous version, right?
> >
> > I saw Andrii's justification to have this on the stack, I think it's
> > fine, but does it have to be initialized? it seems that only used
> > entries are copied to map
>
> No. We're not adding 1k stack consumption.
Right, and I thought we concluded as much last time, so it's a bit
surprising to see this in this patch.
Tao, you should go with 3 entries per CPU, used in a stack-like
fashion, and then pass that entry into get_perf_callchain() (to
avoid one extra copy).
>
> pw-bot: cr