Message-Id: <fe0e4007f3dac99d114d50ed365885795aa2c22f648e72b57c5ff7e8762f57ce@mail.kernel.org>
Date: Thu, 18 Dec 2025 18:27:58 +0000 (UTC)
From: bot+bpf-ci@...nel.org
To: ameryhung@...il.com,bpf@...r.kernel.org
Cc: netdev@...r.kernel.org,alexei.starovoitov@...il.com,andrii@...nel.org,daniel@...earbox.net,memxor@...il.com,martin.lau@...nel.org,kpsingh@...nel.org,yonghong.song@...ux.dev,song@...nel.org,haoluo@...gle.com,ameryhung@...il.com,kernel-team@...a.com,ast@...nel.org,andrii@...nel.org,daniel@...earbox.net,martin.lau@...nel.org,eddyz87@...il.com,yonghong.song@...ux.dev,clm@...a.com,ihor.solodrai@...ux.dev
Subject: Re: [PATCH bpf-next v3 05/16] bpf: Change local_storage->lock and b->lock to rqspinlock

> diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
> index fa629a180e9e..1d21ec11c80e 100644
> --- a/kernel/bpf/bpf_local_storage.c
> +++ b/kernel/bpf/bpf_local_storage.c

[ ... ]

> @@ -382,7 +390,10 @@ int bpf_selem_unlink(struct bpf_local_storage_elem *selem, bool reuse_now)
>  	local_storage = rcu_dereference_check(selem->local_storage,
>  					      bpf_rcu_lock_held());
>
> -	raw_spin_lock_irqsave(&local_storage->lock, flags);
> +	err = raw_res_spin_lock_irqsave(&local_storage->lock, flags);
> +	if (err)
> +		return err;

This change allows bpf_selem_unlink() to fail, but bpf_local_storage_map_free()
calls it from a while loop that keeps iterating as long as hlist_first_rcu()
returns an element:

    while ((selem = hlist_entry_safe(
            rcu_dereference_raw(hlist_first_rcu(&b->list)), ...))) {
        WARN_ON(bpf_selem_unlink(selem, true));
    }

If bpf_selem_unlink() fails (returns an error), the selem is not removed from
b->list, so hlist_first_rcu() returns the same selem again. Would this cause
an infinite loop, printing warnings until the system becomes unresponsive?
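
Purely as an illustration (a minimal sketch, not the actual kernel code, and
assuming the selem's bucket-list node is map_node as in bpf_local_storage.h),
the loop could bail out instead of re-fetching the same element forever:

    while ((selem = hlist_entry_safe(
            rcu_dereference_raw(hlist_first_rcu(&b->list)),
            struct bpf_local_storage_elem, map_node))) {
        /* On failure the selem stays on b->list; warn once and stop
         * walking the bucket instead of spinning on it.
         */
        if (WARN_ON_ONCE(bpf_selem_unlink(selem, true)))
            break;
    }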

[ ... ]

> @@ -719,7 +743,7 @@ void bpf_local_storage_destroy(struct bpf_local_storage *local_storage)
>  	 * when unlinking elem from the local_storage->list and
>  	 * the map's bucket->list.
>  	 */
> -	raw_spin_lock_irqsave(&local_storage->lock, flags);
> +	WARN_ON(raw_res_spin_lock_irqsave(&local_storage->lock, flags));
>  	hlist_for_each_entry_safe(selem, n, &local_storage->list, snode) {
>  		/* Always unlink from map before unlinking from
>  		 * local_storage.
>  		 */
> -		bpf_selem_unlink_map(selem);
> +		WARN_ON(bpf_selem_unlink_map(selem));

If raw_res_spin_lock_irqsave() fails, WARN_ON() prints a warning but does
not prevent execution from continuing. The code then proceeds to iterate
local_storage->list without holding local_storage->lock, and eventually
calls raw_res_spin_unlock_irqrestore() at the end without having acquired
the lock. Does this cause issues with the rqspinlock held-lock tracking,
which unconditionally decrements its counter in res_spin_unlock()?
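
As a hypothetical sketch only (variable names follow the quoted hunk), the
safer shape would seem to be bailing out before the walk when the lock cannot
be taken, so the final unlock is only ever reached with the lock held:

    if (WARN_ON_ONCE(raw_res_spin_lock_irqsave(&local_storage->lock, flags)))
        return; /* skip the unlocked walk and the unbalanced unlock */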

Additionally, if bpf_selem_unlink_map() fails, the selem remains linked to
the map's bucket list, but execution continues and
bpf_selem_unlink_storage_nolock() removes it from local_storage->list and
adds it to the free list. When the selem is later freed, would the map's
bucket still hold a dangling reference to it?
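
Again only as an illustrative sketch (the lines between the two quoted hunks
are omitted, and whether leaking the selem here is acceptable is a separate
question), the loop body could skip the storage-side unlink when the map-side
unlink fails:

    hlist_for_each_entry_safe(selem, n, &local_storage->list, snode) {
        /* If the map-side unlink fails, the selem is still on the
         * bucket list; do not move it to the free list, or the bucket
         * would be left pointing at freed memory.
         */
        if (WARN_ON_ONCE(bpf_selem_unlink_map(selem)))
            continue;
        free_storage = bpf_selem_unlink_storage_nolock(
            local_storage, selem, &free_selem_list);
    }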

> @@ -734,7 +758,7 @@ void bpf_local_storage_destroy(struct bpf_local_storage *local_storage)
>  		free_storage = bpf_selem_unlink_storage_nolock(
>  			local_storage, selem, &free_selem_list);
>  	}
> -	raw_spin_unlock_irqrestore(&local_storage->lock, flags);
> +	raw_res_spin_unlock_irqrestore(&local_storage->lock, flags);

[ ... ]


---
AI reviewed your patch. Please fix the bug or reply by email explaining why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/20346662777
