Message-ID: <fb745675-e25c-4dcc-be4b-4a4411056755@linux.dev>
Date: Wed, 28 Jan 2026 10:41:40 +0800
From: Tao Chen <chen.dylane@...ux.dev>
To: Andrii Nakryiko <andrii.nakryiko@...il.com>
Cc: bot+bpf-ci@...nel.org, peterz@...radead.org, mingo@...hat.com,
acme@...nel.org, namhyung@...nel.org, mark.rutland@....com,
alexander.shishkin@...ux.intel.com, jolsa@...nel.org, irogers@...gle.com,
adrian.hunter@...el.com, kan.liang@...ux.intel.com, song@...nel.org,
ast@...nel.org, daniel@...earbox.net, andrii@...nel.org,
martin.lau@...ux.dev, eddyz87@...il.com, yonghong.song@...ux.dev,
john.fastabend@...il.com, kpsingh@...nel.org, sdf@...ichev.me,
haoluo@...gle.com, linux-perf-users@...r.kernel.org,
linux-kernel@...r.kernel.org, bpf@...r.kernel.org, martin.lau@...nel.org,
clm@...a.com, ihor.solodrai@...ux.dev
Subject: Re: [PATCH bpf-next v8 1/3] perf: Add rctx in perf_callchain_entry
On 2026/1/28 05:01, Andrii Nakryiko wrote:
> On Mon, Jan 26, 2026 at 12:51 AM Tao Chen <chen.dylane@...ux.dev> wrote:
>>
>> On 2026/1/26 16:03, bot+bpf-ci@...nel.org wrote:
>>>> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
>>>> index 9ded2e582c60..0330f265173d 100644
>>>> --- a/include/linux/perf_event.h
>>>> +++ b/include/linux/perf_event.h
>>>> @@ -57,6 +57,7 @@
>>>> #include <asm/local.h>
>>>>
>>>> struct perf_callchain_entry {
>>>> + int rctx;
>>>> u64 nr;
>>>> u64 ip[]; /* /proc/sys/kernel/perf_event_max_stack */
>>>> };
>>>
>>> Adding rctx before nr changes the structure layout in a way that
>>> appears to break the perf userspace ABI.
>>>
>>> In perf_output_sample() in kernel/events/core.c, when PERF_SAMPLE_CALLCHAIN
>>> is set, the code does:
>>>
>>> int size = 1;
>>> size += data->callchain->nr;
>>> size *= sizeof(u64);
>>> __output_copy(handle, data->callchain, size);
>>>
>>> This copies (1 + nr) * 8 bytes starting from the beginning of the
>>> structure. Before this patch, offset 0 contained nr (u64), followed
>>> by ip[]. After this patch, offset 0 contains rctx (int) + 4 bytes
>>> padding, then nr at offset 8, then ip[] at offset 16.
>>>
>>> With the new layout, this copy sends the following to userspace:
>>> - First 8 bytes: rctx + padding (kernel-internal data) instead of nr
>>> - Remaining bytes: nr + ip[0..nr-2] instead of ip[0..nr-1]
>>>
>>> This means the callchain data sent to userspace is shifted by 8 bytes,
>>> the last IP is truncated, and the rctx value is leaked to userspace.
>>>
>>> Is there a plan to update perf_output_sample() to output starting from
>>> &entry->nr instead of the structure start? Or am I missing something
>>> about how this structure is used for userspace output?
>>>
>>
>> As AI said, this change shifts the memory layout, which effectively
>> breaks the userspace ABI.
>>
>> Maybe we can use __output_copy(handle, &data->callchain->nr, size);
>
> yep, very impressive for AI to notice this. I agree that
> &data->callchain->nr seems like the best way forward.
>
Will fix it in v9.
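
For reference, the layout shift described in the review can be reproduced
in userspace. The following is only a rough sketch, not the kernel code:
struct callchain_old, struct callchain_new, make_chain(), copy_from_start()
and copy_from_nr() are hypothetical stand-ins, memcpy() stands in for
__output_copy(), and it assumes an LP64 ABI where u64 is 8-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace mirror of the layout before the patch. */
struct callchain_old {
	uint64_t nr;
	uint64_t ip[];
};

/* Hypothetical userspace mirror of the layout after the patch. */
struct callchain_new {
	int rctx;	/* assumed to be followed by 4 bytes of padding */
	uint64_t nr;
	uint64_t ip[];
};

/* Allocate a zeroed entry with nr fake IPs: 0x1000, 0x1001, ... */
static struct callchain_new *make_chain(int rctx, uint64_t nr)
{
	struct callchain_new *e = calloc(1, sizeof(*e) + nr * sizeof(uint64_t));

	assert(e != NULL);
	e->rctx = rctx;
	e->nr = nr;
	for (uint64_t i = 0; i < nr; i++)
		e->ip[i] = 0x1000 + i;
	return e;
}

/* Copy (1 + nr) u64 words from the start of the structure (the problem case). */
static size_t copy_from_start(const struct callchain_new *e, uint64_t *out)
{
	size_t size = (1 + e->nr) * sizeof(uint64_t);

	memcpy(out, e, size);
	return size;
}

/* Copy (1 + nr) u64 words starting at the nr member (the proposed fix). */
static size_t copy_from_nr(const struct callchain_new *e, uint64_t *out)
{
	size_t size = (1 + e->nr) * sizeof(uint64_t);

	memcpy(out, (const char *)e + offsetof(struct callchain_new, nr), size);
	return size;
}
```

With a 3-entry chain, copy_from_start() leaves nr in the second output word
and drops the last IP, while copy_from_nr() produces nr followed by all
three IPs, which is the layout userspace expects.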
>>
>>> [ ... ]
>>>
>>>
>>> ---
>>> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
>>> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>>>
>>> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/21350234831
>>
>>
>> --
>> Best Regards
>> Tao Chen
--
Best Regards
Tao Chen