netdev - Re: [PATCH bpf] bpf/tracing: fix a deadlock in perf_event_detach_bpf


Message-ID: <6e82d70e-8f81-13d6-9fe5-6029ab658537@fb.com>
Date:   Mon, 9 Apr 2018 17:51:25 -0700
From:   Alexei Starovoitov <ast@...com>
To:     Yonghong Song <yhs@...com>, <daniel@...earbox.net>,
        <netdev@...r.kernel.org>
CC:     <kernel-team@...com>
Subject: Re: [PATCH bpf] bpf/tracing: fix a deadlock in
 perf_event_detach_bpf_prog

On 4/9/18 11:41 AM, Yonghong Song wrote:
>
>
> On 4/9/18 9:47 AM, Alexei Starovoitov wrote:
>> On 4/9/18 9:18 AM, Yonghong Song wrote:
>>> syzbot reported a possible deadlock in perf_event_detach_bpf_prog.
>> ...
>>> @@ -985,16 +986,31 @@ int perf_event_query_prog_array(struct
>>> perf_event *event, void __user *info)
>>>          return -EINVAL;
>>>      if (copy_from_user(&query, uquery, sizeof(query)))
>>>          return -EFAULT;
>>> -    if (query.ids_len > BPF_TRACE_MAX_PROGS)
>>> +
>>> +    ids_len = query.ids_len;
>>> +    if (ids_len > BPF_TRACE_MAX_PROGS)
>>>          return -E2BIG;
>>> +    ids = kcalloc(ids_len, sizeof(u32), GFP_USER | __GFP_NOWARN);
>>> +    if (!ids)
>>> +        return -ENOMEM;
>>>
>>>      mutex_lock(&bpf_event_mutex);
>>>      ret = bpf_prog_array_copy_info(event->tp_event->prog_array,
>>> -                       uquery->ids,
>>> -                       query.ids_len,
>>> -                       &uquery->prog_cnt);
>>> +                       ids,
>>> +                       ids_len,
>>> +                       &prog_cnt);
>>>      mutex_unlock(&bpf_event_mutex);
>>>
>>> +    if (!ret || ret == -ENOSPC) {
>>> +        if (copy_to_user(&uquery->prog_cnt, &prog_cnt,
>>> sizeof(prog_cnt)) ||
>>> +            copy_to_user(uquery->ids, ids, ids_len * sizeof(u32))) {
>>> +            ret = -EFAULT;
>>> +            goto out;
>>> +        }
>>> +    }
>>> +
>>> +out:
>>> +    kfree(ids);
>>
>> alloc/free just to avoid this locking dependency feels suboptimal.
>
> We actually already did kcalloc/kfree in bpf_prog_array_copy_to_user.
> In that function, we did not copy_to_user one id at a time.
> We allocate a temporary array and store the result there
> and at the end, we call one copy_to_user to copy to the user buffer.
>
> The patch here just moved this allocation and associated copy_to_user
> out of the function and bpf_event_mutex. It did not introduce new
> allocations.

I see, so the patch is essentially open-coding
bpf_prog_array_copy_to_user().
Can we share the code then?
The bpf/core.c callsite used by trace/bpf_trace.c
and the similar callsite in bpf/cgroup.c
should be using a common helper.
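
To make the shape of such a helper concrete, here is a rough userspace
sketch of the pattern, not kernel code. pthread_mutex_t stands in for
bpf_event_mutex, copy_out() stands in for copy_to_user(), and the names
prog_array, prog_array_copy_info() and query_prog_array() are all
hypothetical. The -ENOSPC return for a too-small buffer is an assumption
taken from the check in the quoted diff.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PROGS 64 /* stand-in for BPF_TRACE_MAX_PROGS */

/* Hypothetical stand-in for a prog array protected by a mutex. */
struct prog_array {
	pthread_mutex_t lock; /* plays the role of bpf_event_mutex */
	uint32_t ids[MAX_PROGS];
	uint32_t cnt;
};

/* Caller holds pa->lock. Copies at most len ids into a private buffer. */
static int prog_array_copy_info(const struct prog_array *pa, uint32_t *ids,
				uint32_t len, uint32_t *prog_cnt)
{
	uint32_t n = pa->cnt < len ? pa->cnt : len;

	assert(ids && prog_cnt);
	memcpy(ids, pa->ids, n * sizeof(*ids));
	*prog_cnt = pa->cnt;
	return pa->cnt > len ? -ENOSPC : 0;
}

/* Stand-in for copy_to_user(): may "fault", so never call it under the lock. */
static int copy_out(void *dst, const void *src, size_t n)
{
	if (!dst)
		return -EFAULT;
	memcpy(dst, src, n);
	return 0;
}

static int query_prog_array(struct prog_array *pa, uint32_t *user_ids,
			    uint32_t ids_len, uint32_t *user_cnt)
{
	uint32_t *ids, prog_cnt = 0;
	int ret;

	if (ids_len > MAX_PROGS)
		return -E2BIG;
	/* Allocate before taking the lock; avoid a zero-sized allocation. */
	ids = calloc(ids_len ? ids_len : 1, sizeof(*ids));
	if (!ids)
		return -ENOMEM;

	/* Only the snapshot into the private buffer happens under the lock. */
	pthread_mutex_lock(&pa->lock);
	ret = prog_array_copy_info(pa, ids, ids_len, &prog_cnt);
	pthread_mutex_unlock(&pa->lock);

	/* The possibly faulting copy runs after the lock is dropped. */
	if (ret == 0 || ret == -ENOSPC) {
		if (copy_out(user_cnt, &prog_cnt, sizeof(prog_cnt)) ||
		    copy_out(user_ids, ids, ids_len * sizeof(*ids)))
			ret = -EFAULT;
	}

	free(ids);
	return ret;
}
```

The point of the split is that the only work done under the lock is a
memcpy into memory the caller owns, so the lock is never held across
anything that can fault.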


>>
>> Maybe we can get rid of bpf_event_mutex in some cases?
>> The perf event itself is locked via perf_event_ctx_lock() when we're
>> calling perf_event_query_prog_array, perf_event_attach|detach_bpf_prog.
>> I forgot what the motivation was for us to introduce it in the
>> first place.
>
> The original motivation for the lock was to make sure bpf_prog_array
> does not change in the middle of attach/detach/query. It looks like
> we have:
>    . perf_event_attach|query under perf_event_ctx_lock
>    . perf_event_detach not under perf_event_ctx_lock
> Introducing perf_event_ctx_lock to perf_event_detach could still
> have the deadlock.

Ahh, right, since the progs are in event->tp_event, which can
be shared by multiple perf_events.
Scratch that idea.
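
For illustration only, a small userspace sketch of why a per-event lock
is not enough here. Two events point at the same tp_event, so the prog
list needs a lock that lives with the shared object. All struct and
function names below are hypothetical stand-ins, and pthread mutexes
stand in for the kernel locks.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_PROGS 64

/* Shared object: one per tracepoint, referenced by many events. */
struct tp_event {
	pthread_mutex_t shared_lock; /* plays the role of bpf_event_mutex */
	int progs[MAX_PROGS];
	size_t nr_progs;
};

/* Per-event object: its own lock says nothing about other events. */
struct event {
	pthread_mutex_t ctx_lock; /* plays the role of perf_event_ctx_lock() */
	struct tp_event *tp_event;
};

/*
 * Attach through one event. Holding only ev->ctx_lock would not stop a
 * second event that shares tp_event from modifying progs concurrently,
 * so the update is serialized on the shared lock.
 */
static int event_attach(struct event *ev, int prog_id)
{
	struct tp_event *tp = ev->tp_event;
	int ret = 0;

	pthread_mutex_lock(&tp->shared_lock);
	if (tp->nr_progs == MAX_PROGS)
		ret = -1;
	else
		tp->progs[tp->nr_progs++] = prog_id;
	pthread_mutex_unlock(&tp->shared_lock);
	return ret;
}
```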