Message-ID: <fa45f7dd-1df9-4928-bca0-0398b0a07eea@oracle.com>
Date: Wed, 15 Oct 2025 10:15:02 +0100
From: Alan Maguire <alan.maguire@...cle.com>
To: Donglin Peng <dolinux.peng@...il.com>,
        Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Andrii Nakryiko <andrii.nakryiko@...il.com>,
        Andrii Nakryiko <andrii@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        linux-trace-kernel <linux-trace-kernel@...r.kernel.org>,
        bpf <bpf@...r.kernel.org>, Eduard Zingerman <eddyz87@...il.com>,
        Alexei Starovoitov <ast@...nel.org>, Song Liu <song@...nel.org>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        pengdonglin <pengdonglin@...omi.com>
Subject: Re: [RFC PATCH v1] btf: Sort BTF types by name and kind to optimize
 btf_find_by_name_kind lookup

On 15/10/2025 04:43, Donglin Peng wrote:
> On Wed, Oct 15, 2025 at 9:54 AM Alexei Starovoitov
> <alexei.starovoitov@...il.com> wrote:
>>
>> On Mon, Oct 13, 2025 at 9:53 PM Donglin Peng <dolinux.peng@...il.com> wrote:
>>>
>>> I’d like to suggest a dual-mechanism approach:
>>> 1. If BTF is generated by a newer pahole (with pre-sorting support), the
>>>     kernel would use the pre-sorted data directly.
>>> 2. For BTF from older pahole versions, the kernel would handle sorting
>>>     at load time or later.
>>
>> The problem with 2 is extra memory consumption for narrow
>> use case. The "time cat trace" example shows that search
>> is in critical path, but I suspect ftrace can do it differently.
>> I don't know why it's doing the search so much.
> 
> Thanks. The reason is that ftrace supports outputting parameters of traced
> functions through funcgraph-args, like this:
> 
>  0)                    |  vfs_write(file=0xffff888102b17380, buf=0x7ffd1e9faaf7, count=0x1, pos=0xffffc90006f83ef0) {
>  0)                    |    rw_verify_area(read_write=1, file=0xffff888102b17380, ppos=0xffffc90006f83ef0, count=0x1) {
>  0)                    |      security_file_permission(file=0xffff888102b17380, mask=2) {
>  0)                    |        selinux_file_permission(file=0xffff888102b17380, mask=2) {
>  0)   0.111 us    |          avc_policy_seqno();
>  0)   0.380 us    |        }
>  0)   0.585 us    |      }
>  0)   0.782 us    |    }
> 
> which requires obtaining function parameter names and types from BTF.
> However, there is currently no direct mapping from function addresses to
> btf_type index information. Therefore, it first obtains the function name from
> the function address, and then searches the BTF file by the function name
> to get the corresponding btf_type.

The problem here is that we do a lookup every time we collect function
args, right? Binary search of sorted function names will make that
better, but it will still be slow if it has to happen every time we dump
function args. Would it make sense to have a more tailored solution,
like a cache of BTF type ids for functions that could be mapped
directly from kallsyms symbols? I mentioned this before [1], but maybe
we could figure something out now?
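
To make the cost concrete, here is a minimal userspace sketch of a binary search over entries sorted by name and then kind. It is not the kernel's btf_find_by_name_kind(); struct demo_type, demo_cmp() and demo_find_by_name_kind() are hypothetical names made up for illustration. Even at O(log n) string comparisons, this work would be repeated on every args dump unless the result is cached.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a BTF type record. */
struct demo_type {
	const char *name;
	int kind;
	unsigned int id;
};

/* Order by name first, then by kind (the table must be sorted this way). */
static int demo_cmp(const char *name, int kind, const struct demo_type *t)
{
	int c = strcmp(name, t->name);

	if (c)
		return c;
	return (kind > t->kind) - (kind < t->kind);
}

/* Binary search; returns the type id, or 0 if there is no match. */
static unsigned int demo_find_by_name_kind(const struct demo_type *tab,
					   size_t n, const char *name,
					   int kind)
{
	size_t lo = 0, hi = n;

	assert(tab != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int c = demo_cmp(name, kind, &tab[mid]);

		if (c == 0)
			return tab[mid].id;
		if (c < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return 0;
}
```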

For example, we have to look up the kallsyms name for the address via
lookup_symbol_name(); it uses get_symbol_pos() internally to find the
index within the kallsyms_offsets array. If we had a similar
kallsyms_btf_ids array, we could use the same index to populate it with
function BTF ids and later do an O(1) lookup. We would just need a
kallsyms lookup that returned the index, or indeed a new API which
returned the name and the BTF id (if we added such an index to the
kallsyms code directly). We could even just populate the entries on
first use, and then it would function as a cache. Each entry would need
to hold module + btf_id, so 64 bits per entry to support both module and
kernel BTF. Seems possible though?
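
A rough userspace sketch of that idea, only to illustrate the shape of it: the array name demo_kallsyms_btf_ids, the packing helpers and demo_slow_lookup() are all hypothetical, and the split of module id in the upper 32 bits and BTF id in the lower 32 bits is just one assumed layout. An entry of 0 means "not populated yet", which works here because BTF id 0 is void and so never a function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NUM_SYMS 4

/* Hypothetical: one 64-bit slot per kallsyms index, 0 = not populated. */
static uint64_t demo_kallsyms_btf_ids[DEMO_NUM_SYMS];
static unsigned int demo_slow_lookups;

/* Assumed layout: module id in the upper 32 bits (0 = vmlinux). */
static uint64_t demo_pack(uint32_t mod_id, uint32_t btf_id)
{
	return ((uint64_t)mod_id << 32) | btf_id;
}

static uint32_t demo_mod_id(uint64_t entry)
{
	return (uint32_t)(entry >> 32);
}

static uint32_t demo_btf_id(uint64_t entry)
{
	return (uint32_t)entry;
}

/* Stand-in for the slow path (symbol name, then BTF search by name). */
static uint64_t demo_slow_lookup(size_t sym_idx)
{
	demo_slow_lookups++;
	return demo_pack(0, 100 + (uint32_t)sym_idx);
}

/* O(1) after first use: the slot is filled lazily and then reused. */
static uint64_t demo_lookup(size_t sym_idx)
{
	assert(sym_idx < DEMO_NUM_SYMS);
	if (!demo_kallsyms_btf_ids[sym_idx])
		demo_kallsyms_btf_ids[sym_idx] = demo_slow_lookup(sym_idx);
	return demo_kallsyms_btf_ids[sym_idx];
}
```

In this sketch only the first demo_lookup() for a given index pays for the slow path; a real version would also have to deal with module unload and concurrent updates, which are not modelled here.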

[1]
https://lore.kernel.org/linux-trace-kernel/8455bc79-a684-476d-88bd-9f7ff9ffa637@oracle.com/
