Message-ID: <CAMB2axMA_8t6M2RJ97Y-9BB_E3N0h=9ZKArazbNKS_ySuNkKAQ@mail.gmail.com>
Date: Sat, 31 Jan 2026 10:40:18 -0800
From: Amery Hung <ameryhung@...il.com>
To: bot+bpf-ci@...nel.org
Cc: bpf@...r.kernel.org, netdev@...r.kernel.org, alexei.starovoitov@...il.com,
andrii@...nel.org, daniel@...earbox.net, memxor@...il.com,
martin.lau@...nel.org, kpsingh@...nel.org, yonghong.song@...ux.dev,
song@...nel.org, haoluo@...gle.com, kernel-team@...a.com, ast@...nel.org,
eddyz87@...il.com, clm@...a.com, ihor.solodrai@...ux.dev
Subject: Re: [PATCH bpf-next v4 01/16] bpf: Select bpf_local_storage_map_bucket
based on bpf_local_storage
On Fri, Jan 30, 2026 at 9:33 PM <bot+bpf-ci@...nel.org> wrote:
>
> > diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
> > index e2fe6c328..6615091dd 100644
> > --- a/kernel/bpf/bpf_local_storage.c
> > +++ b/kernel/bpf/bpf_local_storage.c
>
> [ ... ]
>
> > @@ -368,9 +371,13 @@ static void bpf_selem_unlink_map(struct bpf_local_storage_elem *selem)
> > void bpf_selem_link_map(struct bpf_local_storage_map *smap,
> > struct bpf_local_storage_elem *selem)
> > {
> > - struct bpf_local_storage_map_bucket *b = select_bucket(smap, selem);
> > + struct bpf_local_storage *local_storage;
> > + struct bpf_local_storage_map_bucket *b;
> > unsigned long flags;
> >
> > + local_storage = rcu_dereference_check(selem->local_storage,
> > + bpf_rcu_lock_held());
> > + b = select_bucket(smap, local_storage);
>
> The commit message states: "This is safe since a selem is always linked to
> local_storage before linked to map."
>
> Is this claim accurate? Looking at bpf_local_storage_update(), the order is:
>
> /* First, link the new selem to the map */
> bpf_selem_link_map(smap, selem);
>
> /* Second, link (and publish) the new selem to local_storage */
> bpf_selem_link_storage_nolock(local_storage, selem);
>
> The comment explicitly says "First, link the new selem to the map" which is
> the opposite order from what the commit message claims.
>
> Similarly, in bpf_sk_storage_clone() when new_sk_storage is non-NULL:
>
> bpf_selem_link_map(smap, copy_selem);
> bpf_selem_link_storage_nolock(new_sk_storage, copy_selem);
>
> Since bpf_selem_alloc() does not initialize selem->local_storage (it only
> sets SDATA(selem)->smap), and bpf_selem_link_storage_nolock() is what sets
> selem->local_storage, wouldn't this mean bpf_selem_link_map() reads NULL
> from selem->local_storage?
>
> With local_storage being NULL, select_bucket() would compute hash_ptr(NULL)
> which returns 0, placing all newly linked selems into bucket 0. Later when
> bpf_selem_unlink_map() runs, selem->local_storage has been set, so it
> computes a different bucket and tries to unlink from the wrong bucket.
>
> The next commit in the series (82de4fd08 "bpf: Convert bpf_selem_unlink_map
> to failable") introduces bpf_selem_link_map_nolock() to fix this for
> bpf_local_storage_alloc() and bpf_local_storage_update(), but
> bpf_sk_storage_clone() still uses bpf_selem_link_map() and appears to
> remain affected throughout the series.
The AI is right here. I will fix the bug. We should follow the
local_storage-then-map linking order in these three places.
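
To illustrate the mismatch described above, here is a minimal user-space
sketch, not kernel code. Everything prefixed with toy_ is hypothetical, as
are BUCKET_BITS and the buckets_agree() helper; the multiplicative hash only
approximates hash_ptr() in that it maps a NULL (0) owner to bucket 0. It
compares the bucket chosen at link time with the bucket an unlink would
compute afterwards, for both linking orders.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUCKET_BITS 4	/* hypothetical table size: 16 buckets */

/* Toy stand-in for bpf_local_storage_elem; only the owner field is modeled. */
struct toy_selem {
	uintptr_t local_storage;	/* 0 models a NULL selem->local_storage */
};

/* Multiplicative hash in the spirit of hash_ptr(); an owner of 0 maps to 0. */
static unsigned int toy_select_bucket(uintptr_t local_storage)
{
	uint64_t h = (uint64_t)local_storage * UINT64_C(0x61C8864680B583EB);

	return (unsigned int)(h >> (64 - BUCKET_BITS));
}

/* Models bpf_selem_link_map(): the bucket is derived from the owner
 * currently recorded in the selem, whatever it is at this moment.
 */
static unsigned int toy_link_map(const struct toy_selem *selem)
{
	return toy_select_bucket(selem->local_storage);
}

/* Models bpf_selem_link_storage_nolock(): records the owner in the selem. */
static void toy_link_storage(struct toy_selem *selem, uintptr_t storage)
{
	selem->local_storage = storage;
}

/* Returns 1 if the bucket used at link time equals the bucket a later
 * unlink would compute, 0 otherwise.
 */
int buckets_agree(int storage_first, uintptr_t storage)
{
	struct toy_selem selem = { 0 };
	unsigned int linked, unlinked;

	if (storage_first) {
		toy_link_storage(&selem, storage);
		linked = toy_link_map(&selem);
	} else {
		linked = toy_link_map(&selem);	/* owner still 0 here */
		toy_link_storage(&selem, storage);
	}

	unlinked = toy_select_bucket(selem.local_storage);
	return linked == unlinked;
}
```

With a made-up owner value such as 0x1000, buckets_agree(1, 0x1000) reports
agreement, while buckets_agree(0, 0x1000) does not: the map-first order links
into bucket 0 and the later unlink looks in a different bucket.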
>
> > raw_spin_lock_irqsave(&b->lock, flags);
> > hlist_add_head_rcu(&selem->map_node, &b->list);
> > raw_spin_unlock_irqrestore(&b->lock, flags);
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/21539340647
>
> AI-authorship-score: low
> AI-authorship-explanation: The commit has typical human characteristics: domain-specific BPF knowledge, concise explanations, and follows kernel conventions.
> issues-found: 1
> issue-severity-score: high
> issue-severity-explanation: Hash bucket mismatch causes selems to be inserted into wrong bucket and never properly unlinked, leading to hash list corruption.