lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <74d4c8e40e61dad369607ecd8b98f58a515479f0.camel@gmail.com>
Date: Fri, 07 Nov 2025 10:19:33 -0800
From: Eduard Zingerman <eddyz87@...il.com>
To: Donglin Peng <dolinux.peng@...il.com>, bot+bpf-ci@...nel.org, 
	ast@...nel.org, andrii.nakryiko@...il.com
Cc: zhangxiaoqin@...omi.com, linux-kernel@...r.kernel.org,
 bpf@...r.kernel.org, 	alan.maguire@...cle.com, song@...nel.org,
 pengdonglin@...omi.com, 	andrii@...nel.org, daniel@...earbox.net,
 martin.lau@...nel.org, 	yonghong.song@...ux.dev, clm@...a.com,
 ihor.solodrai@...ux.dev
Subject: Re: [PATCH v5 6/7] btf: Add lazy sorting validation for binary
 search

On Fri, 2025-11-07 at 15:08 +0800, Donglin Peng wrote:
> On Thu, Nov 6, 2025 at 9:47 PM <bot+bpf-ci@...nel.org> wrote:
> > 
> > > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > > index 66cb739a0..33c327d3c 100644
> > > --- a/kernel/bpf/btf.c
> > > +++ b/kernel/bpf/btf.c
> > > @@ -552,6 +552,70 @@ u32 btf_nr_types(const struct btf *btf)
> > >       return total;
> > >  }
> > > 
> > > +/* Verifies that BTF types are sorted in ascending order
> > > according to their
> > > + * names, with named types appearing before anonymous types. If
> > > the ordering
> > > + * is correct, counts the number of named types and updates the
> > > BTF object's
> > > + * nr_sorted_types field.
> > > + *
> > > + * Return: true if types are properly sorted, false otherwise
> > > + */
> > > +static bool btf_check_sorted(struct btf *btf)
> > > +{
> > > +     const struct btf_type *t;
> > > +     int i, n, k = 0, nr_sorted_types;
> > > +
> > > +     if (likely(btf->nr_sorted_types != BTF_NEED_SORT_CHECK))
> > > +             goto out;
> > > +     btf->nr_sorted_types = 0;
> >                             ^
> > 
> > Can multiple threads race when writing to btf->nr_sorted_types
> > here?
> > Looking at btf_find_by_name_kind()->btf_check_sorted(), I see that
> > btf_find_by_name_kind() receives a const pointer but casts away the
> > const
> > to call btf_check_sorted(). The function bpf_find_btf_id() calls
> > btf_find_by_name_kind() without holding any locks (line 737), and
> > later
> > explicitly unlocks before calling it again (lines 756-757).
> > 
> > This means multiple threads can concurrently enter
> > btf_check_sorted() and
> > write to btf->nr_sorted_types. While the validation logic is
> > idempotent
> > and all threads would compute the same value, the concurrent writes
> > to the
> > same memory location without synchronization could trigger KCSAN
> > warnings.
> > 
> > Should this use atomic operations, or should the validation be
> > performed
> > under a lock during BTF initialization before the BTF becomes
> > visible to
> > multiple threads?
> 
> Hi, is it necessary to address this issue?
> For example, by using atomic_try_cmpxchg or WRITE/READ_ONCE? Using
> atomic_try_cmpxchg can prevent race conditions on writes but requires
> an atomic
> variable, while WRITE_ONCE/READ_ONCE can avoid KCSAN warnings. Since
> the race condition is unlikely to cause critical issues, I suggest
> using
> WRITE_ONCE/READ_ONCE.

Probably use WRITE_ONCE/READ_ONCE?

> > > +
> > > +     if (btf->nr_types < 2)
> > > +             goto out;
> > > +
> > > +     nr_sorted_types = 0;
> > > +     n = btf_nr_types(btf) - 1;
> > > +     for (i = btf_start_id(btf); i < n; i++) {
> > > +             k = i + 1;
> > > +             if (btf_compare_type_names(&i, &k, btf) > 0)
> > > +                     goto out;
> > > +
> > > +             t = btf_type_by_id(btf, i);
> > > +             if (t->name_off)
> > > +                     nr_sorted_types++;
> > > +     }
> > > +
> > > +     t = btf_type_by_id(btf, k);
> > > +     if (t->name_off)
> > > +             nr_sorted_types++;
> > > +     if (nr_sorted_types)
> > > +             btf->nr_sorted_types = nr_sorted_types;
> >                              ^
> > 
> > The same race condition applies to this write of nr_sorted_types as
> > well.
> > 
> > > +
> > > +out:
> > > +     return btf->nr_sorted_types > 0;
> > > +}
> > 
> > [ ... ]
> > 
> > > @@ -610,7 +674,7 @@ s32 btf_find_by_name_kind(const struct btf
> > > *btf, const char *name, u8 kind)
> > >                       goto out;
> > >       }
> > > 
> > > -     if (btf->nr_sorted_types != BTF_NEED_SORT_CHECK) {
> > > +     if (btf_check_sorted((struct btf *)btf)) {
> >                                   ^
> > 
> > The const cast here enables the concurrent writes discussed above.
> > Is
> > there a reason to mark the btf parameter as const if we're
> > modifying it?
> 
> Hi team, is casting away const an acceptable approach for our
> codebase?

Casting away const is undefined behaviour, e.g. see paragraph 6.7.3.6
N1570 ISO/IEC 9899:201x Programming languages — C.

Both of the problems above can be avoided if kernel will do sorted
check non-lazily. But Andrii and Alexei seem to like that property.

> 
> > 
> > 
> > ---
> > AI reviewed your patch. Please fix the bug or email reply why it's
> > not a bug.
> > See:
> > https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
> > 
> > CI run summary:
> > https://github.com/kernel-patches/bpf/actions/runs/19137195500

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ