linux-kernel - Re: [PATCH bpf-next v6 2/2] selftests/bpf: Add test to verify freeing the special fields when update [lru_,]percpu

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <17a9444f-1151-44b4-b2ec-5cafd12bf2bd@linux.dev>
Date: Thu, 13 Nov 2025 21:16:32 +0800
From: Leon Hwang <leon.hwang@...ux.dev>
To: Yonghong Song <yonghong.song@...ux.dev>, bpf@...r.kernel.org
Cc: ast@...nel.org, andrii@...nel.org, daniel@...earbox.net,
 martin.lau@...ux.dev, eddyz87@...il.com, song@...nel.org,
 john.fastabend@...il.com, kpsingh@...nel.org, sdf@...ichev.me,
 haoluo@...gle.com, jolsa@...nel.org, memxor@...il.com, ameryhung@...il.com,
 linux-kernel@...r.kernel.org, kernel-patches-bot@...com
Subject: Re: [PATCH bpf-next v6 2/2] selftests/bpf: Add test to verify freeing
 the special fields when update [lru_,]percpu_hash maps



On 2025/11/12 05:58, Yonghong Song wrote:
>
>
> On 11/11/25 5:52 AM, Leon Hwang wrote:
>>
>> On 2025/11/11 21:38, Leon Hwang wrote:
>>>
>>> On 2025/11/7 10:00, Yonghong Song wrote:
>>>>
>>>> On 11/5/25 7:14 AM, Leon Hwang wrote:
[...]

>>>>> +
>>>>> +static int __insert_in_list(struct bpf_list_head *head, struct
>>>>> bpf_spin_lock *lock,
>>>>> +                struct node_data __kptr **node)
>>>>> +{
>>>>> +    struct node_data *node_new, *node_ref, *node_old;
>>>>> +
>>>>> +    node_new = bpf_obj_new(typeof(*node_new));
>>>>> +    if (!node_new)
>>>>> +        return -1;
>>>>> +
>>>>> +    node_ref = bpf_refcount_acquire(node_new);
>>>>> +    node_old = bpf_kptr_xchg(node, node_new);
>>>> Change the above to node_old = bpf_kptr_xchg(node, node_node_ref);
>>>> might
>>>> be better for reasoning although node_ref/node_new are the same.
>>>>
>>> Nope — node_ref and node_new are different for the verifier.
>> They are the same in theory.
>>
>> The verifier failure was likely caused by something else, but I'm not
>> sure of the exact reason.
>
> I did some analysis and your code works as expected:
>
>     node_ref = bpf_refcount_acquire(node_new);
>     node_old = bpf_kptr_xchg(node, node_new);
>     if (node_old) {
>             bpf_obj_drop(node_old);
>             bpf_obj_drop(node_ref);
>             return -2;
>     }
>
>     bpf_spin_lock(lock);
>     bpf_list_push_front(head, &node_ref->l);
>     ref = (u64)(void *) &node_ref->ref;
>     bpf_spin_unlock(lock);
>
> In the above, after the following insn:
>     node_old = bpf_kptr_xchg(node, node_new);
> the second argument 'node_new' will become a scalar since it
> may be changed by another bpf program accessing the same map.
>
> So your code is okay as node_ref still valid ptr_node_data
> and can be used in following codes.
>
>
> My suggestion to replace
>     node_old = bpf_kptr_xchg(node, node_new);
> with
>     node_old = bpf_kptr_xchg(node, node_ref);
> will not work since node_ref will be a scalar
> so subsequent bpf_obj_drop(node_ref) and bpf_list_push_front(...)
> won't work.
>
> In summary, your change look okay to me. Sorry for noise.
>

Hi Yonghong,

Thanks for your detailed analysis.

Indeed, in the case of
node_old = bpf_kptr_xchg(node, node_ref);,
the verifier marks node_ref as SCALAR_VALUE.

Here's the relevant part in check_helper_call():

static int check_helper_call(struct bpf_verifier_env *env, struct
bpf_insn *insn,
                             int *insn_idx_p)
{
        //...
        if (meta.release_regno) {
                err = -EINVAL;
                if (arg_type_is_dynptr(fn->arg_type[meta.release_regno -
BPF_REG_1])) {
                        err = unmark_stack_slots_dynptr(env,
&regs[meta.release_regno]);
                } else if (func_id == BPF_FUNC_kptr_xchg &&
meta.ref_obj_id) {
                        u32 ref_obj_id = meta.ref_obj_id;
                        bool in_rcu = in_rcu_cs(env);
                        struct bpf_func_state *state;
                        struct bpf_reg_state *reg;

                        err = release_reference_nomark(env->cur_state,
ref_obj_id);
                        if (!err) {

bpf_for_each_reg_in_vstate(env->cur_state, state, reg, ({
                                        if (reg->ref_obj_id == ref_obj_id) {
                                                if (in_rcu && (reg->type
& MEM_ALLOC) && (reg->type & MEM_PERCPU)) {
                                                        reg->ref_obj_id = 0;
                                                        reg->type &=
~MEM_ALLOC;
                                                        reg->type |=
MEM_RCU;
                                                } else {

mark_reg_invalid(env, reg);
                                                }
                                        }
                                }));
                        }
                } else if (meta.ref_obj_id) {
                        //...
        }
        //...
}

This logic matches your explanation — when the argument passed to
bpf_kptr_xchg() holds a reference (meta.ref_obj_id), that reference is
not a KPTR_PERCPU inside an RCU critical section.
As a result, node_ref is invalidated and becomes a scalar after the call.

So yes, your reasoning is correct, and this behavior explains why using
node_ref as the second argument doesn't work.

Thanks,
Leon

[...]