lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e449a1ad-0efb-d65c-5dae-c146a5a5511a@huaweicloud.com>
Date: Sun, 25 Jun 2023 19:20:22 +0800
From: Hou Tao <houtao@...weicloud.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>, daniel@...earbox.net,
 andrii@...nel.org, void@...ifault.com, paulmck@...nel.org
Cc: tj@...nel.org, rcu@...r.kernel.org, netdev@...r.kernel.org,
 bpf@...r.kernel.org, kernel-team@...com
Subject: Re: [PATCH v2 bpf-next 12/13] bpf: Introduce bpf_mem_free_rcu()
 similar to kfree_rcu().

Hi,

On 6/24/2023 2:49 PM, Hou Tao wrote:
> Hi,
>
> On 6/24/2023 11:13 AM, Alexei Starovoitov wrote:
>> From: Alexei Starovoitov <ast@...nel.org>
>>
> SNIP
>>  
>> +static void __free_by_rcu(struct rcu_head *head)
>> +{
>> +	struct bpf_mem_cache *c = container_of(head, struct bpf_mem_cache, rcu);
>> +	struct bpf_mem_cache *tgt = c->tgt;
>> +	struct llist_node *llnode;
>> +
>> +	if (unlikely(READ_ONCE(c->draining)))
>> +		goto out;
>> +
>> +	llnode = llist_del_all(&c->waiting_for_gp);
>> +	if (!llnode)
>> +		goto out;
>> +
>> +	if (llist_add_batch(llnode, c->waiting_for_gp_tail, &tgt->free_by_rcu_ttrace))
>> +		tgt->free_by_rcu_ttrace_tail = c->waiting_for_gp_tail;
> Got a null-ptr dereference oops when running multiple test_maps and
> htab-mem benchmark after hacking htab to use bpf_mem_cache_free_rcu().
> And I think it happened as follow:
The same null-ptr dereference happened even before switching htab to use
bpf_mem_cache_free_rcu(). So it seems after introducing c->tgt, there
could be two concurrent free_bulk() but with the same c->tgt.
>
> // c->tgt
> P1: __free_by_rcu()
>         // c->tgt is the same as P1
>         P2: __free_by_rcu()
>
> // return true
> P1: llist_add_batch(&tgt->free_by_rcu_ttrace)
>         // return false
>         P2: llist_add_batch(&tgt->free_by_rcu_ttrace)
>         P2: do_call_rcu_ttrace
>         // return false
>         P2: xchg(tgt->call_rcu_ttrace_in_progress, 1)
>         // llnode is not NULL
>         P2: llnode = llist_del_all(&c->free_by_rcu_ttrace)
>         // BAD: c->free_by_rcu_ttrace_tail is NULL, so oops
>         P2: __llist_add_batch(llnode, c->free_by_rcu_ttrace_tail)
>
> P1: tgt->free_by_rcu_ttrace_tail = X   
>
> I don't have a good fix for the problem except adding a spin-lock for
> free_by_rcu_ttrace and free_by_rcu_ttrace_tail.
>
>
> .


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ