Message-Id: <886b17f3dd9f9836dd87b9249a01e737088fe52174b8d513ad6350c6bb19ff87@mail.kernel.org>
Date: Sun, 9 Nov 2025 16:58:15 +0000 (UTC)
From: bot+bpf-ci@...nel.org
To: chen.dylane@...ux.dev,peterz@...radead.org,mingo@...hat.com,acme@...nel.org,namhyung@...nel.org,mark.rutland@....com,alexander.shishkin@...ux.intel.com,jolsa@...nel.org,irogers@...gle.com,adrian.hunter@...el.com,kan.liang@...ux.intel.com
Cc: linux-perf-users@...r.kernel.org,linux-kernel@...r.kernel.org,bpf@...r.kernel.org,chen.dylane@...ux.dev,ast@...nel.org,andrii@...nel.org,daniel@...earbox.net,martin.lau@...nel.org,eddyz87@...il.com,yonghong.song@...ux.dev,clm@...a.com,ihor.solodrai@...ux.dev
Subject: Re: [PATCH bpf-next v5 1/3] perf: Refactor get_perf_callchain
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index fd1d91017..edd3058e4 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -67,6 +67,7 @@ struct perf_callchain_entry_ctx {
>  	u32			nr;
>  	short			contexts;
>  	bool			contexts_maxed;
> +	bool			add_mark;
>  };
[ ... ]
> diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c
> index 808c0d7a3..fb1f26be2 100644
> --- a/kernel/events/callchain.c
> +++ b/kernel/events/callchain.c
[ ... ]
> +void __get_perf_callchain_user(struct perf_callchain_entry_ctx *ctx, struct pt_regs *regs)
> +{
> +	int start_entry_idx;
> +
> +	if (!user_mode(regs)) {
> +		if (current->flags & (PF_KTHREAD | PF_USER_WORKER))
> +			return;
> +		regs = task_pt_regs(current);
> +	}
> +
> +	if (ctx->add_mark)
> +		perf_callchain_store_context(ctx, PERF_CONTEXT_USER);
> +
> +	start_entry_idx = ctx->nr;
                          ^^^^^^^^
> +	perf_callchain_user(ctx, regs);
> +	fixup_uretprobe_trampoline_entries(ctx->entry, start_entry_idx);
> +}
Can this break the uretprobe trampoline fixup? The original code used
entry->nr for start_entry_idx, but the refactored version uses ctx->nr.
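
For reference, the user-side part of the original get_perf_callchain()
(before this refactor) looked roughly like this:

	if (add_mark)
		perf_callchain_store_context(&ctx, PERF_CONTEXT_USER);

	start_entry_idx = entry->nr;
	perf_callchain_user(&ctx, regs);
	fixup_uretprobe_trampoline_entries(entry, start_entry_idx);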
These two counters track different things:

  - entry->nr counts ALL entries (incremented by both
    perf_callchain_store_context() and perf_callchain_store())
  - ctx->nr only counts regular callchain entries (incremented only by
    perf_callchain_store(), not by perf_callchain_store_context())
When add_mark is true, perf_callchain_store_context() increments
entry->nr but not ctx->nr, causing them to become out of sync.
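
The asymmetry comes from the store helpers themselves; trimmed down to
just the counter updates (bounds checks elided), they look roughly like
this in include/linux/perf_event.h:

	static inline int
	perf_callchain_store_context(struct perf_callchain_entry_ctx *ctx, u64 ip)
	{
		struct perf_callchain_entry *entry = ctx->entry;

		/* a context marker bumps entry->nr and ctx->contexts only */
		entry->ip[entry->nr++] = ip;
		++ctx->contexts;
		return 0;
	}

	static inline int
	perf_callchain_store(struct perf_callchain_entry_ctx *ctx, u64 ip)
	{
		struct perf_callchain_entry *entry = ctx->entry;

		/* a regular frame bumps both entry->nr and ctx->nr */
		entry->ip[entry->nr++] = ip;
		++ctx->nr;
		return 0;
	}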
For example, if kernel callchain collection runs first with add_mark
set:

  Initial: ctx->nr = entry->nr = 0

  In __get_perf_callchain_kernel():
    perf_callchain_store_context() stores the PERF_CONTEXT_KERNEL marker
      -> entry->nr becomes 1, ctx->nr still 0
    perf_callchain_kernel() stores N kernel frames
      -> entry->nr becomes 1+N, ctx->nr becomes N

  In __get_perf_callchain_user():
    perf_callchain_store_context() stores the PERF_CONTEXT_USER marker
      -> entry->nr becomes 2+N, ctx->nr still N
    start_entry_idx = ctx->nr assigns N (should be 2+N)
    fixup_uretprobe_trampoline_entries() scans from the wrong index
This makes fixup_uretprobe_trampoline_entries() start its scan too
early, covering kernel callchain entries and the PERF_CONTEXT_USER
marker in addition to the user frames, instead of starting at the first
user entry as the original entry->nr-based code did.
Should this use ctx->entry->nr instead of ctx->nr?
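
If so, the change would be a one-liner in __get_perf_callchain_user(),
something like (untested):

-	start_entry_idx = ctx->nr;
+	start_entry_idx = ctx->entry->nr;

so the fixup scan starts right after the PERF_CONTEXT_USER marker, at
the first user frame, matching the old behavior.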
---
AI reviewed your patch. Please fix the bug or reply to explain why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/19211481431