Message-ID: <b51ca47f-46e2-457e-a152-2f7fbdeee1e2@linux.dev>
Date: Fri, 6 Feb 2026 15:25:06 -0800
From: Martin KaFai Lau <martin.lau@...ux.dev>
To: Amery Hung <ameryhung@...il.com>
Cc: netdev@...r.kernel.org, alexei.starovoitov@...il.com, andrii@...nel.org,
 daniel@...earbox.net, memxor@...il.com, martin.lau@...nel.org,
 kpsingh@...nel.org, yonghong.song@...ux.dev, song@...nel.org,
 haoluo@...gle.com, bpf@...r.kernel.org, kernel-team@...a.com
Subject: Re: [PATCH bpf-next v7 10/17] bpf: Support lockless unlink when
 freeing map or local storage

On 2/5/26 2:29 PM, Amery Hung wrote:
> +/*
> + * Unlink an selem from map and local storage with lockless fallback if callers
> + * are racing or rqspinlock returns error. It should only be called by
> + * bpf_local_storage_destroy() or bpf_local_storage_map_free().
> + */
> +static void bpf_selem_unlink_nofail(struct bpf_local_storage_elem *selem,
> +				    struct bpf_local_storage_map_bucket *b)
> +{
> +	bool in_map_free = !!b, free_storage = false;
> +	struct bpf_local_storage *local_storage;
> +	struct bpf_local_storage_map *smap;
> +	unsigned long flags;
> +	int err, unlink = 0;
> +
> +	local_storage = rcu_dereference_check(selem->local_storage, bpf_rcu_lock_held());
> +	smap = rcu_dereference_check(SDATA(selem)->smap, bpf_rcu_lock_held());
> +
> +	if (smap) {
> +		b = b ? : select_bucket(smap, local_storage);
> +		err = raw_res_spin_lock_irqsave(&b->lock, flags);
> +		if (!err) {
> +			/*
> +			 * Call bpf_obj_free_fields() under b->lock to make sure it is done
> +			 * exactly once for an selem. Safe to free special fields immediately
> +			 * as no BPF program should be referencing the selem.
> +			 */
> +			if (likely(selem_linked_to_map(selem))) {
> +				hlist_del_init_rcu(&selem->map_node);
> +				bpf_obj_free_fields(smap->map.record, SDATA(selem)->data);
> +				unlink++;
> +			}
> +			raw_res_spin_unlock_irqrestore(&b->lock, flags);
> +		}
> +		/*
> +		 * Highly unlikely scenario: resource leak
> +		 *
> +		 * When map_free(selem1), destroy(selem1) and destroy(selem2) are racing
> +		 * and both selem belong to the same bucket, if destroy(selem2) acquired
> +		 * b->lock and block for too long, neither map_free(selem1) and
> +		 * destroy(selem1) will be able to free the special field associated
> +		 * with selem1 as raw_res_spin_lock_irqsave() returns -ETIMEDOUT.
> +		 */
> +		WARN_ON_ONCE(err && in_map_free);
> +		if (!err || in_map_free)
> +			RCU_INIT_POINTER(SDATA(selem)->smap, NULL);
> +	}
> +
> +	if (local_storage) {
> +		err = raw_res_spin_lock_irqsave(&local_storage->lock, flags);
> +		if (!err) {
> +			if (likely(selem_linked_to_storage(selem))) {
> +				free_storage = hlist_is_singular_node(&selem->snode,
> +								      &local_storage->list);
> +				 /*
> +				  * Okay to skip clearing owner_storage and storage->owner in
> +				  * destroy() since the owner is going away. No user or bpf
> +				  * programs should be able to reference it.
> +				  */
> +				if (smap && in_map_free)
> +					bpf_selem_unlink_storage_nolock_misc(
> +						selem, smap, local_storage,
> +						free_storage, true);
> +				hlist_del_init_rcu(&selem->snode);
> +				unlink++;
> +			}
> +			raw_res_spin_unlock_irqrestore(&local_storage->lock, flags);
> +		}
> +		if (!err || !in_map_free)
> +			RCU_INIT_POINTER(selem->local_storage, NULL);
> +	}
> +
> +	if (unlink != 2)
> +		atomic_or(in_map_free ? SELEM_MAP_UNLINKED : SELEM_STORAGE_UNLINKED, &selem->state);
> +
> +	/*
> +	 * Normally, an selem can be unlinked under local_storage->lock and b->lock, and
> +	 * then freed after an RCU grace period. However, if destroy() and map_free() are
> +	 * racing or rqspinlock returns errors in unlikely situations (unlink != 2), free
> +	 * the selem only after both map_free() and destroy() see the selem.
> +	 */
> +	if (unlink == 2 ||
> +	    atomic_cmpxchg(&selem->state, SELEM_UNLINKED, SELEM_TOFREE) == SELEM_UNLINKED)
> +		bpf_selem_free(selem, true);
> +
> +	if (free_storage)
> +		bpf_local_storage_free(local_storage, true);

I think there is a chance that selem->state reaches SELEM_UNLINKED while
free_storage is still false, in which case local_storage is leaked.

AFAIK this can happen when destroy() fails to take its own
local_storage->lock, which should be very unlikely. This function already
has a WARN_ON_ONCE for a similar case. If handling this unlikely case is
not worth the complexity, it probably deserves a WARN_ON_ONCE here as
well. This can be done as a follow-up.
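To make the scenario concrete, here is a minimal userspace sketch of the tail of bpf_selem_unlink_nofail(), assuming C11 atomics stand in for the kernel's atomic_or()/atomic_cmpxchg(). The flag values, model_unlink_tail() and model_demo() are hypothetical and exist only for illustration; the patch defines the real SELEM_* values elsewhere. It only models the state handoff in the quoted hunk. Whether local_storage really leaks also depends on what the callers do afterwards, which is not shown here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag values, chosen only so that UNLINKED == MAP | STORAGE. */
enum {
	SELEM_MAP_UNLINKED     = 1 << 0,
	SELEM_STORAGE_UNLINKED = 1 << 1,
	SELEM_UNLINKED         = SELEM_MAP_UNLINKED | SELEM_STORAGE_UNLINKED,
	SELEM_TOFREE           = 1 << 2,
};

/*
 * Model of the last two decisions in the quoted function: record which side
 * has seen the selem, then report whether this caller is the one that frees it.
 */
static bool model_unlink_tail(atomic_int *state, bool in_map_free, int unlink)
{
	int expected = SELEM_UNLINKED;

	if (unlink != 2)
		atomic_fetch_or(state, in_map_free ? SELEM_MAP_UNLINKED
						   : SELEM_STORAGE_UNLINKED);

	/* Fast path: both lists were unlinked by this caller. */
	if (unlink == 2)
		return true;

	/* Slow path: free only once both destroy() and map_free() have seen it. */
	return atomic_compare_exchange_strong(state, &expected, SELEM_TOFREE);
}

/* Walk through the case raised above: destroy() times out on its own lock. */
static void model_demo(void)
{
	atomic_int state = 0;
	bool free_storage_destroy = false;  /* never set: storage lock not taken */
	bool free_storage_map_free = false; /* never set: storage pointer already NULL */

	/* destroy(): unlinked from the map only, local_storage->lock failed. */
	assert(!model_unlink_tail(&state, false, 1));

	/* map_free(): arrives second and wins the cmpxchg, so it frees the selem. */
	assert(model_unlink_tail(&state, true, 1));
	assert(atomic_load(&state) == SELEM_TOFREE);

	/* The selem is freed, yet neither caller saw free_storage == true. */
	assert(!free_storage_destroy && !free_storage_map_free);
}
```

In this model the selem itself is freed exactly once, but nothing in the tail notices that free_storage stayed false on both sides, which is the spot where a WARN_ON_ONCE could go.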

Thanks for working on this. It is a huge effort. The set is applied.

> +}

