[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <f3e041d4-c65a-4c16-99ff-37caceebb54a@linux.dev>
Date: Fri, 9 Jan 2026 13:53:28 -0800
From: Martin KaFai Lau <martin.lau@...ux.dev>
To: Amery Hung <ameryhung@...il.com>
Cc: bpf@...r.kernel.org, netdev@...r.kernel.org,
alexei.starovoitov@...il.com, andrii@...nel.org, daniel@...earbox.net,
memxor@...il.com, martin.lau@...nel.org, kpsingh@...nel.org,
yonghong.song@...ux.dev, song@...nel.org, haoluo@...gle.com,
kernel-team@...a.com
Subject: Re: [PATCH bpf-next v3 01/16] bpf: Convert bpf_selem_unlink_map to
failable
On 1/9/26 10:39 AM, Amery Hung wrote:
>>> @@ -574,20 +603,37 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap,
>>> goto unlock;
>>> }
>>>
>>> + b = select_bucket(smap, selem);
>>> +
>>> + if (old_sdata) {
>>> + old_b = select_bucket(smap, SELEM(old_sdata));
>>> + old_b = old_b == b ? NULL : old_b;
>>> + }
>>> +
>>> + raw_spin_lock_irqsave(&b->lock, b_flags);
>>> +
>>> + if (old_b)
>>> + raw_spin_lock_irqsave(&old_b->lock, old_b_flags);
>> This will deadlock because of the lock ordering of b and old_b.
>> Replacing it with res_spin_lock in the later patch can detect it and
>> break it more gracefully. imo, we should not introduce a known deadlock
>> logic in the kernel code in the syscall code path and ask the current
>> user to retry the map_update_elem syscall.
>>
>> What happened to the patch in the earlier revision that uses the
>> local_storage (or owner) for select_bucket?
> Thanks for reviewing!
>
> I decided to revert it because this introduces the dependency of selem
> to local_storage when unlinking. bpf_selem_unlink_lockless() cannot
> assume map or local_storage associated with a selem to be alive. In
> the case where local_storage is already destroyed, we won't be able to
> figure out the bucket if select_bucket() uses local_storage for
> hashing.
>
> A middle ground is to use local_storage for hashing, but save the
> bucket index in selem so that local_storage pointer won't be needed
> later. WDYT?
I would try not to add another "const"-like value to selem if it does
not have to. imo, it is quite wasteful considering the number of
selem(s) that can live in the system. Yes, there is one final 8-byte
hole in selem, but it still should not be used lightly unless nothing
else can be shared. The atomic/u16/bool added in this set can be
discussed later once patch 10 is concluded.
For select_bucket in bpf_selem_unlink_lockless, map_free should know the
bucket. destroy() should have the local_storage, no?
Powered by blists - more mailing lists