Message-ID: <9c53734d-9185-46b7-b07d-bf24ac06e688@bytedance.com>
Date: Thu, 19 Dec 2024 20:38:16 +0800
From: Abel Wu <wuyun.abel@...edance.com>
To: Yonghong Song <yonghong.song@...ux.dev>,
 Martin KaFai Lau <martin.lau@...ux.dev>, Alexei Starovoitov
 <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
 Andrii Nakryiko <andrii@...nel.org>, Eduard Zingerman <eddyz87@...il.com>,
 Song Liu <song@...nel.org>, John Fastabend <john.fastabend@...il.com>,
 KP Singh <kpsingh@...nel.org>, Stanislav Fomichev <sdf@...ichev.me>,
 Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
 David Vernet <void@...ifault.com>
Cc: "open list:BPF [STORAGE & CGROUPS]" <bpf@...r.kernel.org>,
 open list <linux-kernel@...r.kernel.org>
Subject: Re: Re: [PATCH bpf] bpf: Fix deadlock when freeing cgroup storage

Hi Yonghong,

On 12/19/24 10:45 AM, Yonghong Song wrote:
>
> On 12/18/24 1:21 AM, Abel Wu wrote:
>> The following commit
>> bc235cdb423a ("bpf: Prevent deadlock from recursive bpf_task_storage_[get|delete]")
>> first introduced deadlock prevention for fentry/fexit programs attaching
>> on bpf_task_storage helpers. That commit also employed the logic in map
>> free path in its v6 version.
>>
>> Later bpf_cgrp_storage was first introduced in
>> c4bcfb38a95e ("bpf: Implement cgroup storage available to non-cgroup-attached bpf progs")
>> which faces the same issue as bpf_task_storage, instead of its busy
>> counter, NULL was passed to bpf_local_storage_map_free() which opened
>> a window to cause deadlock:
>>
>>     <TASK>
>>     _raw_spin_lock_irqsave+0x3d/0x50
>>     bpf_local_storage_update+0xd1/0x460
>>     bpf_cgrp_storage_get+0x109/0x130
>>     bpf_prog_72026450ec387477_cgrp_ptr+0x38/0x5e
>>     bpf_trace_run1+0x84/0x100
>>     cgroup_storage_ptr+0x4c/0x60
>>     bpf_selem_unlink_storage_nolock.constprop.0+0x135/0x160
>>     bpf_selem_unlink_storage+0x6f/0x110
>>     bpf_local_storage_map_free+0xa2/0x110
>>     bpf_map_free_deferred+0x5b/0x90
>>     process_one_work+0x17c/0x390
>>     worker_thread+0x251/0x360
>>     kthread+0xd2/0x100
>>     ret_from_fork+0x34/0x50
>>     ret_from_fork_asm+0x1a/0x30
>>     </TASK>
>>
>>     [ Since the verifier treats 'void *' as scalar which
>>       prevents me from getting a pointer to 'struct cgroup *',
>>       I added a raw tracepoint in cgroup_storage_ptr() to
>>       help reproducing this issue. ]
>>
>> Although it is tricky to reproduce, the risk of deadlock exists and
>> worthy of a fix, by passing its busy counter to the free procedure so
>> it can be properly incremented before storage/smap locking.
> 
> The above stack trace and explanation does not show that we will have
> a potential dead lock here. You mentioned that it is tricky to reproduce,
> does it mean that you have done some analysis or coding to reproduce it?
> Could you share the details on why you think we may have deadlock here?

The stack shows an AA deadlock: cgroup_storage_ptr() is called with
storage->lock held, and the bpf_prog attached to this function then
tries to acquire the same lock by calling bpf_cgrp_storage_get().

The tricky part is that, instead of attaching to cgroup_storage_ptr()
directly, I added a tracepoint inside it and attached the program there:

------
diff --git a/kernel/bpf/bpf_cgrp_storage.c b/kernel/bpf/bpf_cgrp_storage.c
index 20f05de92e9c..679209d4f88f 100644
--- a/kernel/bpf/bpf_cgrp_storage.c
+++ b/kernel/bpf/bpf_cgrp_storage.c
@@ -40,6 +40,8 @@ static struct bpf_local_storage __rcu **cgroup_storage_ptr(void *owner)
  {
         struct cgroup *cg = owner;

+       trace_cgroup_ptr(cg);
+
         return &cg->bpf_cgrp_storage;
  }

------

The reason for doing so is that typecasting from 'void *owner' to
'struct cgroup *' is rejected by the verifier. But there could be
other ways to obtain a pointer to the @owner cgroup, so the deadlock
remains possible.
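
To make the failure mode concrete, here is a minimal user-space sketch
of the pattern, not kernel code. All names in it (busy, lock_held,
deadlocks, hooked_prog_get, map_free) are made up for illustration:
'busy' stands in for the per-CPU bpf_cgrp_storage_busy counter,
'lock_held' for storage->lock, and a nested lock attempt is merely
counted, where a real spinlock would spin forever.

```c
#include <stdbool.h>

static int busy;        /* model of the per-CPU busy counter */
static bool lock_held;  /* model of the non-recursive storage->lock */
static int deadlocks;   /* number of attempts to re-take a held lock */

static void storage_lock(void)
{
	if (lock_held)
		deadlocks++;    /* a real spinlock would never return here */
	lock_held = true;
}

static void storage_unlock(void)
{
	lock_held = false;
}

/* Model of a tracing prog calling bpf_cgrp_storage_get(). */
static bool hooked_prog_get(void)
{
	if (busy)               /* counter already raised: back off */
		return false;
	busy++;
	storage_lock();
	storage_unlock();
	busy--;
	return true;
}

/* Model of the map free path; the hook fires while the lock is held. */
static void map_free(bool use_busy_counter)
{
	if (use_busy_counter)
		busy++;         /* raised before any locking */
	storage_lock();
	hooked_prog_get();      /* tracepoint runs the attached prog */
	storage_unlock();
	if (use_busy_counter)
		busy--;
}
```

In this model, map_free(false) corresponds to passing NULL and records
one nested lock attempt, while map_free(true) corresponds to passing
the busy counter: the hooked program sees the counter raised and
returns without touching the lock.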

Thanks,
	Abel

> 
>>
>> Fixes: c4bcfb38a95e ("bpf: Implement cgroup storage available to non-cgroup-attached bpf progs")
>> Signed-off-by: Abel Wu <wuyun.abel@...edance.com>
>> ---
>>   kernel/bpf/bpf_cgrp_storage.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/bpf/bpf_cgrp_storage.c b/kernel/bpf/bpf_cgrp_storage.c
>> index 20f05de92e9c..7996fcea3755 100644
>> --- a/kernel/bpf/bpf_cgrp_storage.c
>> +++ b/kernel/bpf/bpf_cgrp_storage.c
>> @@ -154,7 +154,7 @@ static struct bpf_map *cgroup_storage_map_alloc(union bpf_attr *attr)
>>   static void cgroup_storage_map_free(struct bpf_map *map)
>>   {
>> -    bpf_local_storage_map_free(map, &cgroup_cache, NULL);
>> +    bpf_local_storage_map_free(map, &cgroup_cache, &bpf_cgrp_storage_busy);
>>   }
>>   /* *gfp_flags* is a hidden argument provided by the verifier */
> 

