lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bddc9f1d5c1f2f7f233707cf2af81a2013d46b7d.camel@gmail.com>
Date: Fri, 21 Nov 2025 11:07:16 -0800
From: Eduard Zingerman <eddyz87@...il.com>
To: Donglin Peng <dolinux.peng@...il.com>, Andrii Nakryiko
	 <andrii.nakryiko@...il.com>
Cc: ast@...nel.org, zhangxiaoqin@...omi.com, linux-kernel@...r.kernel.org, 
	bpf@...r.kernel.org, Donglin Peng <pengdonglin@...omi.com>, Alan Maguire
	 <alan.maguire@...cle.com>, Song Liu <song@...nel.org>
Subject: Re: [RFC PATCH v7 5/7] libbpf: Implement BTF type sorting
 validation for binary search optimization

On Thu, 2025-11-20 at 15:25 +0800, Donglin Peng wrote:
> On Thu, Nov 20, 2025 at 3:50 AM Andrii Nakryiko
> <andrii.nakryiko@...il.com> wrote:
> > 
> > On Tue, Nov 18, 2025 at 7:21 PM Donglin Peng <dolinux.peng@...il.com> wrote:
> > > 
> > > From: Donglin Peng <pengdonglin@...omi.com>
> > > 
> > > This patch adds validation to verify BTF type name sorting, enabling
> > > binary search optimization for lookups.
> > > 
> > > Cc: Eduard Zingerman <eddyz87@...il.com>
> > > Cc: Alexei Starovoitov <ast@...nel.org>
> > > Cc: Andrii Nakryiko <andrii.nakryiko@...il.com>
> > > Cc: Alan Maguire <alan.maguire@...cle.com>
> > > Cc: Song Liu <song@...nel.org>
> > > Cc: Xiaoqin Zhang <zhangxiaoqin@...omi.com>
> > > Signed-off-by: Donglin Peng <pengdonglin@...omi.com>
> > > ---
> > >  tools/lib/bpf/btf.c | 59 +++++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 59 insertions(+)
> > > 
> > > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > > index 1d19d95da1d0..d872abff42e1 100644
> > > --- a/tools/lib/bpf/btf.c
> > > +++ b/tools/lib/bpf/btf.c
> > > @@ -903,6 +903,64 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id)
> > >         return type_id;
> > >  }
> > > 
> > > +/* Anonymous types (with empty names) are considered greater than named types
> > > + * and are sorted after them. Two anonymous types are considered equal. Named
> > > + * types are compared lexicographically.
> > > + */
> > > +static int btf_compare_type_names(const void *a, const void *b, void *priv)
> > > +{
> > > +       struct btf *btf = (struct btf *)priv;
> > > +       struct btf_type *ta = btf_type_by_id(btf, *(__u32 *)a);
> > > +       struct btf_type *tb = btf_type_by_id(btf, *(__u32 *)b);
> > > +       const char *na, *nb;
> > > +       bool anon_a, anon_b;
> > > +
> > > +       na = btf__str_by_offset(btf, ta->name_off);
> > > +       nb = btf__str_by_offset(btf, tb->name_off);
> > > +       anon_a = str_is_empty(na);
> > > +       anon_b = str_is_empty(nb);
> > > +
> > > +       if (anon_a && !anon_b)
> > > +               return 1;
> > > +       if (!anon_a && anon_b)
> > > +               return -1;
> > > +       if (anon_a && anon_b)
> > > +               return 0;
> > 
> > any reason to hard-code that anonymous types should come *after* named
> > ones? That requires custom comparison logic here and resolve_btfids,
> > instead of just relying on btf__str_by_offset() returning valid empty
> > string for name_off == 0 and then sorting anon types before named
> > ones, following normal lexicographical sorting rules?
> 
> Thanks. I found that some kernel functions like btf_find_next_decl_tag,
> bpf_core_add_cands, find_bpffs_btf_enums, and find_btf_percpu_datasec
> still use linear search.

- btf_find_next_decl_tag() - this function is called from:
  - btf_find_decl_tag_value(), here whole scan over all BTF types is
    guaranteed to happen (because btf_find_next_decl_tag() is called
    twice);
  - btf_prepare_func_args(), here again whole scan is guaranteed to
    happen, because of the while loop starting from id == 0.
- bpf_core_add_cands() this function is called from
  bpf_core_find_cands(), where it does a linear scan over all types in
  kernel BTF and then a linear scan over all types in module BTFs.
  (Because of how targ_start_id parameter is passed).
- find_bpffs_btf_enums() - this function does a linear scan over all
  types in module BTFs.
- find_btf_percpu_datasec() - this function looks for a DATASEC with
  name ".data..percpu" and returns as soon as the match is found.

Of the 4 functions above only find_btf_percpu_datasec() will return
early if BTF type with specified name is found. And it can be
converted to use btf_find_by_name_kind().

So, it appears that there should not be any performance penalty
(compared to current state of affairs) if anonymous types are put in
front. Wdyt?

> Putting named types first would also help here, as
> it allows anonymous types to be skipped naturally during the search.
> Some of them could be refactored to use btf_find_by_name_kind, but some
> would not be appropriate, such as btf_find_next_decl_tag,
> bpf_core_add_cands,find_btf_percpu_datasec.

Did you observe any performance issue if anonymous types are put in
the front?

> Additionally, in the linear search branch, I saw there is a NULL check for
> the name returned by btf__name_by_offset. This suggests that checking
> name_off == 0 alone may not be sufficient to identify an anonymous type,
> which is why I used str_is_empty for a more robust check.
> 
> > 
> > [...]

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ