Message-ID: <9ebb8b72-4b73-4cf5-9054-9134daf16d0d@linux.dev>
Date: Mon, 10 Nov 2025 17:31:19 +0800
From: Tao Chen <chen.dylane@...ux.dev>
To: bot+bpf-ci@...nel.org, peterz@...radead.org, mingo@...hat.com,
acme@...nel.org, namhyung@...nel.org, mark.rutland@....com,
alexander.shishkin@...ux.intel.com, jolsa@...nel.org, irogers@...gle.com,
adrian.hunter@...el.com, kan.liang@...ux.intel.com
Cc: linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
bpf@...r.kernel.org, ast@...nel.org, andrii@...nel.org,
daniel@...earbox.net, martin.lau@...nel.org, eddyz87@...il.com,
yonghong.song@...ux.dev, clm@...a.com, ihor.solodrai@...ux.dev
Subject: Re: [PATCH bpf-next v5 1/3] perf: Refactor get_perf_callchain

On 2025/11/10 00:58, bot+bpf-ci@...nel.org wrote:
>> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
>> index fd1d91017..edd3058e4 100644
>> --- a/include/linux/perf_event.h
>> +++ b/include/linux/perf_event.h
>> @@ -67,6 +67,7 @@ struct perf_callchain_entry_ctx {
>> u32 nr;
>> short contexts;
>> bool contexts_maxed;
>> + bool add_mark;
>> };
>
> [ ... ]
>
>> diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c
>> index 808c0d7a3..fb1f26be2 100644
>> --- a/kernel/events/callchain.c
>> +++ b/kernel/events/callchain.c
>
> [ ... ]
>
>> +void __get_perf_callchain_user(struct perf_callchain_entry_ctx *ctx, struct pt_regs *regs)
>> +{
>> + int start_entry_idx;
>> +
>> + if (!user_mode(regs)) {
>> + if (current->flags & (PF_KTHREAD | PF_USER_WORKER))
>> + return;
>> + regs = task_pt_regs(current);
>> + }
>> +
>> + if (ctx->add_mark)
>> + perf_callchain_store_context(ctx, PERF_CONTEXT_USER);
>> +
>> + start_entry_idx = ctx->nr;
> ^^^^^^^^
>> + perf_callchain_user(ctx, regs);
>> + fixup_uretprobe_trampoline_entries(ctx->entry, start_entry_idx);
>> +}
>
> Can this break uprobe trampoline fixups? The original code used
> entry->nr for start_entry_idx, but the refactored version uses ctx->nr.
>
> These two counters track different things:
> - entry->nr counts ALL entries (incremented by both
> perf_callchain_store_context() and perf_callchain_store())
> - ctx->nr only counts regular callchain entries (incremented only by
> perf_callchain_store(), not by perf_callchain_store_context())
>
> When add_mark is true, perf_callchain_store_context() increments
> entry->nr but not ctx->nr, causing them to become out of sync.
>
> For example, if kernel callchain collection runs first with add_mark
> set:
>
> Initial: ctx->nr = entry->nr = 0
>
> In __get_perf_callchain_kernel():
> perf_callchain_store_context() stores PERF_CONTEXT_KERNEL marker
> -> entry->nr becomes 1, ctx->nr still 0
> perf_callchain_kernel() stores N kernel frames
> -> entry->nr becomes 1+N, ctx->nr becomes N
>
> In __get_perf_callchain_user():
> perf_callchain_store_context() stores PERF_CONTEXT_USER marker
> -> entry->nr becomes 2+N, ctx->nr still N
> start_entry_idx = ctx->nr assigns N (should be 2+N)
> fixup_uretprobe_trampoline_entries() scans from wrong index
>
> This causes fixup_uretprobe_trampoline_entries() to scan kernel
> callchain entries instead of user callchain entries, missing the
> uretprobe trampolines that need to be fixed up.
>
> Should this use ctx->entry->nr instead of ctx->nr?
>
Using ctx->entry->nr looks better, I will change it.
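
Something like this, if I read the suggestion correctly (untested sketch,
the only change from the hunk above is the start index):

void __get_perf_callchain_user(struct perf_callchain_entry_ctx *ctx, struct pt_regs *regs)
{
	int start_entry_idx;

	if (!user_mode(regs)) {
		if (current->flags & (PF_KTHREAD | PF_USER_WORKER))
			return;
		regs = task_pt_regs(current);
	}

	if (ctx->add_mark)
		perf_callchain_store_context(ctx, PERF_CONTEXT_USER);

	/*
	 * Use the total entry count, which also includes the context
	 * markers, so fixup_uretprobe_trampoline_entries() starts
	 * scanning at the first user frame instead of somewhere inside
	 * the kernel portion of the callchain.
	 */
	start_entry_idx = ctx->entry->nr;
	perf_callchain_user(ctx, regs);
	fixup_uretprobe_trampoline_entries(ctx->entry, start_entry_idx);
}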
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/19211481431
--
Best Regards
Tao Chen