lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <abd75aed-9ff2-4e6d-8fec-2b118264efa9@linux.dev>
Date: Sat, 18 Oct 2025 15:51:22 +0800
From: Tao Chen <chen.dylane@...ux.dev>
To: Andrii Nakryiko <andrii.nakryiko@...il.com>,
 Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Jiri Olsa <olsajiri@...il.com>, Peter Zijlstra <peterz@...radead.org>,
 Ingo Molnar <mingo@...hat.com>, Arnaldo Carvalho de Melo <acme@...nel.org>,
 Namhyung Kim <namhyung@...nel.org>, Mark Rutland <mark.rutland@....com>,
 Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
 Ian Rogers <irogers@...gle.com>, Adrian Hunter <adrian.hunter@...el.com>,
 Kan Liang <kan.liang@...ux.intel.com>, Song Liu <song@...nel.org>,
 Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
 Andrii Nakryiko <andrii@...nel.org>, Martin KaFai Lau
 <martin.lau@...ux.dev>, Eduard <eddyz87@...il.com>,
 Yonghong Song <yonghong.song@...ux.dev>,
 John Fastabend <john.fastabend@...il.com>, KP Singh <kpsingh@...nel.org>,
 Stanislav Fomichev <sdf@...ichev.me>, Hao Luo <haoluo@...gle.com>,
 "linux-perf-use." <linux-perf-users@...r.kernel.org>,
 LKML <linux-kernel@...r.kernel.org>, bpf <bpf@...r.kernel.org>
Subject: Re: [RFC PATCH bpf-next v2 2/2] bpf: Pass external callchain entry to
 get_perf_callchain

在 2025/10/17 04:39, Andrii Nakryiko 写道:
> On Tue, Oct 14, 2025 at 8:02 AM Alexei Starovoitov
> <alexei.starovoitov@...il.com> wrote:
>>
>> On Tue, Oct 14, 2025 at 5:14 AM Jiri Olsa <olsajiri@...il.com> wrote:
>>>
>>> On Tue, Oct 14, 2025 at 06:01:28PM +0800, Tao Chen wrote:
>>>> As Alexei noted, get_perf_callchain() return values may be reused
>>>> if a task is preempted after the BPF program enters migrate disable
>>>> mode. Drawing on the per-cpu design of bpf_perf_callchain_entries,
>>>> stack-allocated memory of bpf_perf_callchain_entry is used here.
>>>>
>>>> Signed-off-by: Tao Chen <chen.dylane@...ux.dev>
>>>> ---
>>>>   kernel/bpf/stackmap.c | 19 +++++++++++--------
>>>>   1 file changed, 11 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
>>>> index 94e46b7f340..acd72c021c0 100644
>>>> --- a/kernel/bpf/stackmap.c
>>>> +++ b/kernel/bpf/stackmap.c
>>>> @@ -31,6 +31,11 @@ struct bpf_stack_map {
>>>>        struct stack_map_bucket *buckets[] __counted_by(n_buckets);
>>>>   };
>>>>
>>>> +struct bpf_perf_callchain_entry {
>>>> +     u64 nr;
>>>> +     u64 ip[PERF_MAX_STACK_DEPTH];
>>>> +};
>>>> +
> 
> we shouldn't introduce another type, there is perf_callchain_entry in
> linux/perf_event.h, what's the problem with using that?

perf_callchain_entry uses flexible array, DEFINE_PER_CPU seems do not
create buffer for this, for ease of use, the size of the ip array has 
been explicitly defined.

struct perf_callchain_entry {
         u64                             nr;
         u64                             ip[]; /* 
/proc/sys/kernel/perf_event_max_stack */
};

> 
>>>>   static inline bool stack_map_use_build_id(struct bpf_map *map)
>>>>   {
>>>>        return (map->map_flags & BPF_F_STACK_BUILD_ID);
>>>> @@ -305,6 +310,7 @@ BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map,
>>>>        bool user = flags & BPF_F_USER_STACK;
>>>>        struct perf_callchain_entry *trace;
>>>>        bool kernel = !user;
>>>> +     struct bpf_perf_callchain_entry entry = { 0 };
>>>
>>> so IIUC having entries on stack we do not need to do preempt_disable
>>> you had in the previous version, right?
>>>
>>> I saw Andrii's justification to have this on the stack, I think it's
>>> fine, but does it have to be initialized? it seems that only used
>>> entries are copied to map
>>
>> No. We're not adding 1k stack consumption.
> 
> Right, and I thought we concluded as much last time, so it's a bit
> surprising to see this in this patch.
> 

Ok, I feel like I'm missing some context from our previous exchange.

> Tao, you should go with 3 entries per CPU used in a stack-like
> fashion. And then passing that entry into get_perf_callchain() (to
> avoid one extra copy).
>

Got it. It is more clearer, will change it in v3.

>>
>> pw-bot: cr


-- 
Best Regards
Tao Chen

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ