Message-ID: <ecb7df62-d707-43dd-943b-98a452b2268e@linux.dev>
Date: Wed, 11 Feb 2026 15:10:53 +0800
From: Tao Chen <chen.dylane@...ux.dev>
To: Andrii Nakryiko <andrii.nakryiko@...il.com>
Cc: song@...nel.org, jolsa@...nel.org, ast@...nel.org, daniel@...earbox.net,
andrii@...nel.org, martin.lau@...ux.dev, eddyz87@...il.com,
yonghong.song@...ux.dev, john.fastabend@...il.com, kpsingh@...nel.org,
sdf@...ichev.me, haoluo@...gle.com, bpf@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH bpf-next v2 1/2] bpf: Add preempt disable for
bpf_get_stack
On 2026/2/7 01:12, Andrii Nakryiko wrote:
> On Fri, Feb 6, 2026 at 1:07 AM Tao Chen <chen.dylane@...ux.dev> wrote:
>>
>> The get_perf_callchain() return values may be reused if a task is preempted
>> after the BPF program enters migrate disable mode, so we should add
>> preempt_disable. And as Andrii suggested, BPF can guarantee perf callchain
>> buffer won't be released during use, for bpf_get_stack_id, BPF stack map
>> will keep them alive by delaying put_callchain_buffer() until freeing time
>> or for bpf_get_stack/bpf_get_task_stack, BPF program itself will hold these
>> buffers alive again, until freeing time which is delayed until after
>> RCU Tasks Trace + RCU grace period.
>>
>> Suggested-by: Andrii Nakryiko <andrii@...nel.org>
>> Signed-off-by: Tao Chen <chen.dylane@...ux.dev>
>> ---
>>
>> Change list:
>> - v1 -> v2
>> - add preempt_disable for bpf_get_stack in patch1
>> - add patch2
>> - v1: https://lore.kernel.org/bpf/20260128165710.928294-1-chen.dylane@linux.dev
>>
>> kernel/bpf/stackmap.c | 13 ++++++-------
>> 1 file changed, 6 insertions(+), 7 deletions(-)
>>
>
> Hm... looking at bpf_get_stack_pe(), I'm not sure what's the exact
> guarantees around that ctx->data->callchain that we pass as
> trace_in... It looks like it's the same temporary per-cpu callchain as
> in other places, just attached (temporarily) to ctx. So we probably
> want preemption disabled/enabled for that one as well, no?

See commit 1d7bf6b7d3e8 ("perf/bpf: Remove preempt disable around BPF
invocation"). bpf_overflow_handler() is called from NMI or at least hard
interrupt context, which is already non-preemptible, so no extra
preempt_disable() is needed for that path.

> And to achieve that, I think we'll need to split out build_id logic out of
> __bpf_get_stack() and do it after preemption is enabled in the
> callers. Luckily it's not that much of a code and logic, should be
> easy. But please analyze this carefully yourself.
>
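
To make the ordering concrete, here is a rough userspace model of the pattern under discussion: copy the entries out of the shared per-CPU callchain while preemption is disabled, then run any post-processing that may sleep (the build_id step in the real code) only on the private copy. This is just a sketch under assumptions. Every model_* name, the counter standing in for the preempt count, and the placeholder post-processing step are hypothetical and are not the stackmap.c code.

```c
/* Userspace model only: every name below is hypothetical, not kernel API. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_DEPTH 8

/* Stand-in for the shared per-CPU callchain entry. */
struct model_callchain {
	uint64_t nr;
	uint64_t ip[MAX_DEPTH];
};

static struct model_callchain percpu_entry; /* reused by the next caller */
static int preempt_count;                   /* models preempt-disable nesting */

static void model_preempt_disable(void) { preempt_count++; }

static void model_preempt_enable(void)
{
	assert(preempt_count > 0);
	preempt_count--;
}

/* Pretend to walk a stack: fill the shared entry with fake addresses. */
static struct model_callchain *model_get_callchain(uint64_t base)
{
	assert(preempt_count > 0); /* shared entry is only stable in this window */
	percpu_entry.nr = 4;
	for (uint64_t i = 0; i < percpu_entry.nr; i++)
		percpu_entry.ip[i] = base + i * 0x10;
	return &percpu_entry;
}

/* Post-processing that may sleep, so it only ever sees the private copy. */
static void model_post_process(uint64_t *buf, size_t n)
{
	assert(preempt_count == 0);
	for (size_t i = 0; i < n; i++)
		buf[i] |= 1; /* placeholder transformation */
}

/* Copy inside the non-preemptible window, post-process afterwards. */
static long model_get_stack(uint64_t base, uint64_t *buf, size_t cap, size_t skip)
{
	struct model_callchain *trace;
	size_t n;

	model_preempt_disable();
	trace = model_get_callchain(base);
	if (trace->nr < skip) {
		model_preempt_enable();
		return -1;
	}
	n = trace->nr - skip;
	if (n > cap)
		n = cap;
	memcpy(buf, trace->ip + skip, n * sizeof(*buf));
	model_preempt_enable();
	/* trace must not be dereferenced past this point */

	model_post_process(buf, n);
	return (long)n;
}
```

In this model the asserts play the role of the invariants we care about: the shared entry is only touched while the counter is non-zero, and the step that may sleep only runs once it is back to zero.
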
> pw-bot: cr
>
>
>> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
>> index da3d328f5c1..1b100a03ef2 100644
>> --- a/kernel/bpf/stackmap.c
>> +++ b/kernel/bpf/stackmap.c
>> @@ -460,8 +460,8 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
>>
>> max_depth = stack_map_calculate_max_depth(size, elem_size, flags);
>>
>> - if (may_fault)
>> - rcu_read_lock(); /* need RCU for perf's callchain below */
>> + if (!trace_in)
>> + preempt_disable();
>>
>> if (trace_in) {
>> trace = trace_in;
>> @@ -474,8 +474,8 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
>> }
>>
>> if (unlikely(!trace) || trace->nr < skip) {
>> - if (may_fault)
>> - rcu_read_unlock();
>> + if (!trace_in)
>> + preempt_enable();
>> goto err_fault;
>> }
>>
>> @@ -493,9 +493,8 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
>> memcpy(buf, ips, copy_len);
>> }
>>
>> - /* trace/ips should not be dereferenced after this point */
>> - if (may_fault)
>> - rcu_read_unlock();
>> + if (!trace_in)
>> + preempt_enable();
>>
>> if (user_build_id)
>> stack_map_get_build_id_offset(buf, trace_nr, user, may_fault);
>> --
>> 2.48.1
>>
--
Best Regards
Tao Chen