[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <d57f3e256038e115f7d82b4e6b26d8da80d3c8d8afb4f0c627e0b435dee7eaf6@mail.kernel.org>
Date: Thu, 6 Nov 2025 13:47:42 +0000 (UTC)
From: bot+bpf-ci@...nel.org
To: dolinux.peng@...il.com,ast@...nel.org
Cc: eddyz87@...il.com,andrii.nakryiko@...il.com,zhangxiaoqin@...omi.com,linux-kernel@...r.kernel.org,bpf@...r.kernel.org,dolinux.peng@...il.com,alan.maguire@...cle.com,song@...nel.org,pengdonglin@...omi.com,ast@...nel.org,andrii@...nel.org,daniel@...earbox.net,martin.lau@...nel.org,eddyz87@...il.com,yonghong.song@...ux.dev,clm@...a.com,ihor.solodrai@...ux.dev
Subject: Re: [PATCH v5 6/7] btf: Add lazy sorting validation for binary search
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index 66cb739a0..33c327d3c 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -552,6 +552,70 @@ u32 btf_nr_types(const struct btf *btf)
> return total;
> }
>
> +/* Verifies that BTF types are sorted in ascending order according to their
> + * names, with named types appearing before anonymous types. If the ordering
> + * is correct, counts the number of named types and updates the BTF object's
> + * nr_sorted_types field.
> + *
> + * Return: true if types are properly sorted, false otherwise
> + */
> +static bool btf_check_sorted(struct btf *btf)
> +{
> + const struct btf_type *t;
> + int i, n, k = 0, nr_sorted_types;
> +
> + if (likely(btf->nr_sorted_types != BTF_NEED_SORT_CHECK))
> + goto out;
> + btf->nr_sorted_types = 0;
^
Can multiple threads race when writing to btf->nr_sorted_types here?
Looking at btf_find_by_name_kind()->btf_check_sorted(), I see that
btf_find_by_name_kind() receives a const pointer but casts away the const
to call btf_check_sorted(). The function bpf_find_btf_id() calls
btf_find_by_name_kind() without holding any locks (line 737), and later
explicitly unlocks before calling it again (lines 756-757).
This means multiple threads can concurrently enter btf_check_sorted() and
write to btf->nr_sorted_types. While the validation logic is idempotent
and all threads would compute the same value, the concurrent writes to the
same memory location without synchronization could trigger KCSAN warnings.
Should this use atomic operations, or should the validation be performed
under a lock during BTF initialization before the BTF becomes visible to
multiple threads?
> +
> + if (btf->nr_types < 2)
> + goto out;
> +
> + nr_sorted_types = 0;
> + n = btf_nr_types(btf) - 1;
> + for (i = btf_start_id(btf); i < n; i++) {
> + k = i + 1;
> + if (btf_compare_type_names(&i, &k, btf) > 0)
> + goto out;
> +
> + t = btf_type_by_id(btf, i);
> + if (t->name_off)
> + nr_sorted_types++;
> + }
> +
> + t = btf_type_by_id(btf, k);
> + if (t->name_off)
> + nr_sorted_types++;
> + if (nr_sorted_types)
> + btf->nr_sorted_types = nr_sorted_types;
^
The same race condition applies to this write of nr_sorted_types as well.
> +
> +out:
> + return btf->nr_sorted_types > 0;
> +}
[ ... ]
> @@ -610,7 +674,7 @@ s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
> goto out;
> }
>
> - if (btf->nr_sorted_types != BTF_NEED_SORT_CHECK) {
> + if (btf_check_sorted((struct btf *)btf)) {
^
The const cast here enables the concurrent writes discussed above. Is
there a reason to mark the btf parameter as const if we're modifying it?
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/19137195500
Powered by blists - more mailing lists