Message-ID: <fa45f7dd-1df9-4928-bca0-0398b0a07eea@oracle.com>
Date: Wed, 15 Oct 2025 10:15:02 +0100
From: Alan Maguire <alan.maguire@...cle.com>
To: Donglin Peng <dolinux.peng@...il.com>,
Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Andrii Nakryiko <andrii.nakryiko@...il.com>,
Andrii Nakryiko <andrii@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
linux-trace-kernel <linux-trace-kernel@...r.kernel.org>,
bpf <bpf@...r.kernel.org>, Eduard Zingerman <eddyz87@...il.com>,
Alexei Starovoitov <ast@...nel.org>, Song Liu <song@...nel.org>,
Masami Hiramatsu <mhiramat@...nel.org>,
Steven Rostedt <rostedt@...dmis.org>,
pengdonglin <pengdonglin@...omi.com>
Subject: Re: [RFC PATCH v1] btf: Sort BTF types by name and kind to optimize
btf_find_by_name_kind lookup
On 15/10/2025 04:43, Donglin Peng wrote:
> On Wed, Oct 15, 2025 at 9:54 AM Alexei Starovoitov
> <alexei.starovoitov@...il.com> wrote:
>>
>> On Mon, Oct 13, 2025 at 9:53 PM Donglin Peng <dolinux.peng@...il.com> wrote:
>>>
>>> I’d like to suggest a dual-mechanism approach:
>>> 1. If BTF is generated by a newer pahole (with pre-sorting support), the
>>> kernel would use the pre-sorted data directly.
>>> 2. For BTF from older pahole versions, the kernel would handle sorting
>>> at load time or later.
>>
>> The problem with 2 is extra memory consumption for narrow
>> use case. The "time cat trace" example shows that search
>> is in critical path, but I suspect ftrace can do it differently.
>> I don't know why it's doing the search so much.
>
> Thanks. The reason is that ftrace supports outputting parameters of traced
> functions through funcgraph-args, like this:
>
> 0)               |  vfs_write(file=0xffff888102b17380, buf=0x7ffd1e9faaf7, count=0x1, pos=0xffffc90006f83ef0) {
> 0)               |    rw_verify_area(read_write=1, file=0xffff888102b17380, ppos=0xffffc90006f83ef0, count=0x1) {
> 0)               |      security_file_permission(file=0xffff888102b17380, mask=2) {
> 0)               |        selinux_file_permission(file=0xffff888102b17380, mask=2) {
> 0)   0.111 us    |          avc_policy_seqno();
> 0)   0.380 us    |        }
> 0)   0.585 us    |      }
> 0)   0.782 us    |    }
>
> which requires obtaining function parameter names and types from BTF.
> However, there is currently no direct mapping from function addresses to
> btf_type index information. Therefore, it first obtains the function name from
> the function address, and then searches the BTF file by the function name
> to get the corresponding btf_type.

The problem here is that we do a lookup every time we collect function
args, right? Binary search over sorted function names will make that
cheaper, but it will still cost a search (O(log n) string comparisons)
every time we dump function args. Would it make sense to have a more
tailored solution instead, like a cache of BTF type ids for functions
that can be mapped directly from kallsyms symbols? I mentioned this
before [1], but maybe we could figure something out now?

For example, we already have to look up the kallsyms name for the address
via lookup_symbol_name(), which uses get_symbol_pos() internally to find
the index within the kallsyms_offsets array. If we had a parallel
kallsyms_btf_ids array, we could use the same index to populate it with
function BTF ids and later do an O(1) lookup. We would just need a
kallsyms lookup that returns the index, or indeed a new API that returns
both the name and the BTF id (if we added such an index to the kallsyms
code directly). We could even populate the entries on first use, so that
the array functions as a cache. Each entry would need a module plus a
btf_id, so 64 bits per entry to support both module and kernel BTF. Seems
possible though?
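
To make that a bit more concrete, here is a rough userspace sketch of the
populate-on-first-use idea. It is not a proposed patch. Everything
prefixed with sketch_ is made up for illustration, including the array
size, the packing of a BTF object id into the upper 32 bits (one possible
way to encode "module + btf_id"), and the demo slow path that stands in
for the existing name-based BTF search. Locking and memory ordering are
ignored entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the number of kallsyms entries. */
#define SKETCH_NUM_SYMS 4

/*
 * One 64-bit slot per symbol, indexed by the same position that
 * get_symbol_pos() finds. Assumed layout: upper 32 bits identify the
 * BTF object (0 for vmlinux here), lower 32 bits hold the BTF type id.
 * A slot of 0 means "not resolved yet".
 */
static uint64_t sketch_kallsyms_btf_ids[SKETCH_NUM_SYMS];

/* Signature of the slow path, i.e. the name-based BTF search. */
typedef uint64_t (*sketch_resolve_fn)(size_t sym_idx);

static uint64_t sketch_pack(uint32_t btf_obj_id, uint32_t type_id)
{
	return ((uint64_t)btf_obj_id << 32) | type_id;
}

/*
 * Return the packed id for a symbol index. The slow path runs only
 * while the slot is still empty; after that it is a plain array read.
 */
static uint64_t sketch_btf_id_for_sym(size_t sym_idx, sketch_resolve_fn slow)
{
	assert(sym_idx < SKETCH_NUM_SYMS);

	if (!sketch_kallsyms_btf_ids[sym_idx])
		sketch_kallsyms_btf_ids[sym_idx] = slow(sym_idx);

	return sketch_kallsyms_btf_ids[sym_idx];
}

/* Demo slow path: counts its calls and fabricates a vmlinux type id. */
static unsigned int sketch_slow_calls;

static uint64_t sketch_demo_resolve(size_t sym_idx)
{
	sketch_slow_calls++;
	return sketch_pack(0, 100 + (uint32_t)sym_idx);
}
```

Calling sketch_btf_id_for_sym(2, sketch_demo_resolve) twice runs the
slow path only on the first call. One caveat of using 0 as the empty
marker is that a function with no BTF would be searched for on every
call, so a real version would want a distinct "looked up, not found"
value.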

[1] https://lore.kernel.org/linux-trace-kernel/8455bc79-a684-476d-88bd-9f7ff9ffa637@oracle.com/