[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <34E0EF7C-A247-4C6D-9C16-CDD29F20C7EF@fb.com>
Date: Sun, 12 Jul 2020 06:34:31 +0000
From: Song Liu <songliubraving@...com>
To: Andrii Nakryiko <andrii.nakryiko@...il.com>
CC: open list <linux-kernel@...r.kernel.org>,
bpf <bpf@...r.kernel.org>, Networking <netdev@...r.kernel.org>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Kernel Team <Kernel-team@...com>,
john fastabend <john.fastabend@...il.com>,
KP Singh <kpsingh@...omium.org>,
Jesper Dangaard Brouer <brouer@...hat.com>,
Peter Ziljstra <peterz@...radead.org>
Subject: Re: [PATCH bpf-next 1/5] bpf: block bpf_get_[stack|stackid] on
perf_event with PEBS entries
> On Jul 11, 2020, at 10:06 PM, Andrii Nakryiko <andrii.nakryiko@...il.com> wrote:
>
> On Fri, Jul 10, 2020 at 11:28 PM Song Liu <songliubraving@...com> wrote:
>>
>>
>>
>>> On Jul 10, 2020, at 8:53 PM, Andrii Nakryiko <andrii.nakryiko@...il.com> wrote:
>>>
>>> On Fri, Jul 10, 2020 at 6:30 PM Song Liu <songliubraving@...com> wrote:
>>>>
>>>> Calling get_perf_callchain() on perf_events from PEBS entries may cause
>>>> unwinder errors. To fix this issue, the callchain is fetched early. Such
>>>> perf_events are marked with __PERF_SAMPLE_CALLCHAIN_EARLY.
>>>>
>>>> Similarly, calling bpf_get_[stack|stackid] on perf_events from PEBS may
>>>> also cause unwinder errors. To fix this, block bpf_get_[stack|stackid] on
>>>> these perf_events. Unfortunately, bpf verifier cannot tell whether the
>>>> program will be attached to perf_event with PEBS entries. Therefore,
>>>> block such programs during ioctl(PERF_EVENT_IOC_SET_BPF).
>>>>
>>>> Signed-off-by: Song Liu <songliubraving@...com>
>>>> ---
>>>
>>> Perhaps it's a stupid question, but why bpf_get_stack/bpf_get_stackid
>>> can't figure out automatically that they are called from
>>> __PERF_SAMPLE_CALLCHAIN_EARLY perf event and use different callchain,
>>> if necessary?
>>>
>>> It is quite suboptimal from a user experience point of view to require
>>> two different BPF helpers depending on PEBS or non-PEBS perf events.
>>
>> I am not aware of an easy way to tell the difference in bpf_get_stack.
>> But I do agree that would be much better.
>>
>
> Hm... Looking a bit more how all this is tied together in the kernel,
> I think it's actually quite easy. So, for perf_event BPF program type:
>
> 1. return a special prototype for bpf_get_stack/bpf_get_stackid, which
> will have this extra bit of logic for callchain. All other program
> types with access to bpf_get_stack/bpf_get_stackid should use the
> current one, probably.
> 2. For that special program, just like for bpf_read_branch_records(),
> we know that context is actually `struct bpf_perf_event_data_kern *`,
> and it has pt_regs, perf_sample_data and perf_event itself.
> 3. With that, it seems like you'll have everything you need to
> automatically choose a proper callchain.
>
> All this absolutely transparently to the BPF program.
>
> Am I missing something?
Good idea! A separate prototype should work here.
Thanks,
Song
Powered by blists - more mailing lists