Message-ID: <cd6155cf-f1e0-4d6b-98af-a53c4999c5a3@linux.dev>
Date: Thu, 31 Jul 2025 18:05:45 -0700
From: Martin KaFai Lau <martin.lau@...ux.dev>
To: Amery Hung <ameryhung@...il.com>
Cc: bpf@...r.kernel.org, netdev@...r.kernel.org,
alexei.starovoitov@...il.com, andrii@...nel.org, daniel@...earbox.net,
memxor@...il.com, kpsingh@...nel.org, martin.lau@...nel.org,
yonghong.song@...ux.dev, song@...nel.org, haoluo@...gle.com,
kernel-team@...a.com
Subject: Re: [RFC PATCH bpf-next v1 01/11] bpf: Convert bpf_selem_unlink_map
to failable
On 7/29/25 11:25 AM, Amery Hung wrote:
> - bpf_local_storage_update()
>
> The three-step update process (link_map(new_selem),
> link_storage(new_selem), then unlink_map(old_selem)) should not fail in
> the middle. Hence, take both b->lock before the update process starts.
>
> While locking two different buckets chosen by the hash function
> introduces a varying locking order, this will not cause an ABBA deadlock
> since it is performed under local_storage->lock.
I am not sure it is always true. e.g. two threads running on different cores can
do bpf_local_storage_update() for two different sk; then there are two
different local_storage->lock, so nothing serializes the order in which the two
bucket locks are taken.
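A hypothetical interleaving (the bucket names b_A and b_B are illustrative, not
from the patch):

    CPU 0: update on sk1               CPU 1: update on sk2
    --------------------               --------------------
    lock(sk1_storage->lock)            lock(sk2_storage->lock)
    lock(b_A->lock)                    lock(b_B->lock)
    lock(b_B->lock)    ... waits       lock(b_A->lock)    ... waits

The two local_storage->lock are different locks, so they cannot order the
b_A/b_B acquisitions: that is the ABBA case.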
My current thought is to change select_bucket() to depend on the owner
pointer (e.g. *sk, *task...) or the local_storage pointer instead. The
intuition is that the owner pointer is easier to understand than the
local_storage pointer. Then the same owner always hashes to the same bucket of
a map, so an update never needs to lock two buckets.
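Something like this, as a sketch only (the function shape and field names
follow the current select_bucket() in bpf_local_storage.c, which hashes the
selem pointer; the owner-keyed body is the suggestion, not existing code):

  /* Key the bucket on the owner rather than the selem, so the old and
   * new selem of the same owner always land in the same bucket and
   * bpf_local_storage_update() only ever takes one bucket lock.
   */
  static struct bpf_local_storage_map_bucket *
  select_bucket(struct bpf_local_storage_map *smap, void *owner)
  {
  	return &smap->buckets[hash_ptr(owner, smap->bucket_log)];
  }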
I am not sure the owner pointer is always available during delete in the
current setup; this needs checking. iirc, local_storage->lock and the bucket
lock are not always acquired together today, while it seems this patch set now
needs to acquire both of them together where possible. With that, I suspect
something else can be simplified here, and it may also make the owner pointer
available during delete (if it is indeed missing in some cases now). Not very
sure yet; I need a bit more time to take a closer look.
Thanks for working on this! I think it can simplify the local storage.
[ ... ]
> @@ -560,8 +595,9 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap,
> struct bpf_local_storage_data *old_sdata = NULL;
> struct bpf_local_storage_elem *alloc_selem, *selem = NULL;
> struct bpf_local_storage *local_storage;
> + struct bpf_local_storage_map_bucket *b, *old_b;
> HLIST_HEAD(old_selem_free_list);
> - unsigned long flags;
> + unsigned long flags, b_flags, old_b_flags;
> int err;
>
> /* BPF_EXIST and BPF_NOEXIST cannot be both set */
> @@ -645,20 +681,31 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap,
> goto unlock;
> }
>
> + b = select_bucket(smap, selem);
> + old_b = old_sdata ? select_bucket(smap, SELEM(old_sdata)) : b;
> +
> + raw_spin_lock_irqsave(&b->lock, b_flags);
> + if (b != old_b)
> + raw_spin_lock_irqsave(&old_b->lock, old_b_flags);