[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMB2axMMgGtYqE3+jbhN=T=GrGdN3DctP6jeR1t9M+4tLkeO0w@mail.gmail.com>
Date: Mon, 12 Jan 2026 09:47:08 -0800
From: Amery Hung <ameryhung@...il.com>
To: Martin KaFai Lau <martin.lau@...ux.dev>
Cc: bpf@...r.kernel.org, netdev@...r.kernel.org, alexei.starovoitov@...il.com,
andrii@...nel.org, daniel@...earbox.net, memxor@...il.com,
martin.lau@...nel.org, kpsingh@...nel.org, yonghong.song@...ux.dev,
song@...nel.org, haoluo@...gle.com, kernel-team@...a.com
Subject: Re: [PATCH bpf-next v3 01/16] bpf: Convert bpf_selem_unlink_map to failable
On Fri, Jan 9, 2026 at 1:53 PM Martin KaFai Lau <martin.lau@...ux.dev> wrote:
>
> On 1/9/26 10:39 AM, Amery Hung wrote:
> >>> @@ -574,20 +603,37 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap,
> >>> goto unlock;
> >>> }
> >>>
> >>> + b = select_bucket(smap, selem);
> >>> +
> >>> + if (old_sdata) {
> >>> + old_b = select_bucket(smap, SELEM(old_sdata));
> >>> + old_b = old_b == b ? NULL : old_b;
> >>> + }
> >>> +
> >>> + raw_spin_lock_irqsave(&b->lock, b_flags);
> >>> +
> >>> + if (old_b)
> >>> + raw_spin_lock_irqsave(&old_b->lock, old_b_flags);
> >> This will deadlock because of the lock ordering of b and old_b.
> >> Replacing it with res_spin_lock in the later patch can detect it and
> >> break it more gracefully. imo, we should not introduce a known deadlock
> >> logic in the kernel code in the syscall code path and ask the current
> >> user to retry the map_update_elem syscall.
> >>
> >> What happened to the patch in the earlier revision that uses the
> >> local_storage (or owner) for select_bucket?
> > Thanks for reviewing!
> >
> > I decided to revert it because this introduces the dependency of selem
> > to local_storage when unlinking. bpf_selem_unlink_lockless() cannot
> > assume map or local_storage associated with a selem to be alive. In
> > the case where local_storage is already destroyed, we won't be able to
> > figure out the bucket if select_bucket() uses local_storage for
> > hashing.
> >
> > A middle ground is to use local_storage for hashing, but save the
> > bucket index in selem so that local_storage pointer won't be needed
> > later. WDYT?
>
> I would try not to add another "const"-like value to selem if it does
> not have to. imo, it is quite wasteful considering the number of
> selem(s) that can live in the system. Yes, there is one final 8-byte
> hole in selem, but it still should not be used lightly unless nothing
> else can be shared. The atomic/u16/bool added in this set can be
> discussed later once patch 10 is concluded.
>
I see.
> For select_bucket in bpf_selem_unlink_lockless, map_free should know the
> bucket. destroy() should have the local_storage, no?
>
You are right. I can get the bucket from map_free(). I will use
local_storage for hashing.
Powered by blists - more mailing lists