[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aaee39fc-85b1-370b-c334-8892eff5c43c@fb.com>
Date: Thu, 17 May 2018 10:50:38 -0700
From: Yonghong Song <yhs@...com>
To: Daniel Borkmann <daniel@...earbox.net>,
Peter Zijlstra <peterz@...radead.org>
CC: <ast@...com>, <netdev@...r.kernel.org>, <kernel-team@...com>
Subject: Re: [PATCH bpf-next 2/7] bpf: introduce bpf subcommand
BPF_PERF_EVENT_QUERY
On 5/17/18 8:32 AM, Daniel Borkmann wrote:
> On 05/16/2018 11:59 PM, Yonghong Song wrote:
>> On 5/16/18 4:27 AM, Peter Zijlstra wrote:
>>> On Tue, May 15, 2018 at 04:45:16PM -0700, Yonghong Song wrote:
>>>> Currently, suppose a userspace application has loaded a bpf program
>>>> and attached it to a tracepoint/kprobe/uprobe, and a bpf
>>>> introspection tool, e.g., bpftool, wants to show which bpf program
>>>> is attached to which tracepoint/kprobe/uprobe. Such attachment
>>>> information will be really useful to understand the overall bpf
>>>> deployment in the system.
>>>>
>>>> There is a name field (16 bytes) for each program, which could
>>>> be used to encode the attachment point. There are some drawbacks
>>>> for this approaches. First, bpftool user (e.g., an admin) may not
>>>> really understand the association between the name and the
>>>> attachment point. Second, if one program is attached to multiple
>>>> places, encoding a proper name which can imply all these
>>>> attachments becomes difficult.
>>>>
>>>> This patch introduces a new bpf subcommand BPF_PERF_EVENT_QUERY.
>>>> Given a pid and fd, if the <pid, fd> is associated with a
>>>> tracepoint/kprobe/uprobea perf event, BPF_PERF_EVENT_QUERY will return
>>>> . prog_id
>>>> . tracepoint name, or
>>>> . k[ret]probe funcname + offset or kernel addr, or
>>>> . u[ret]probe filename + offset
>>>> to the userspace.
>>>> The user can use "bpftool prog" to find more information about
>>>> bpf program itself with prog_id.
>>>>
>>>> Signed-off-by: Yonghong Song <yhs@...com>
>>>> ---
>>>> include/linux/trace_events.h | 15 ++++++
>>>> include/uapi/linux/bpf.h | 25 ++++++++++
>>>> kernel/bpf/syscall.c | 113 +++++++++++++++++++++++++++++++++++++++++++
>>>> kernel/trace/bpf_trace.c | 53 ++++++++++++++++++++
>>>> kernel/trace/trace_kprobe.c | 29 +++++++++++
>>>> kernel/trace/trace_uprobe.c | 22 +++++++++
>>>> 6 files changed, 257 insertions(+)
>>>
>>> Why is the command called *_PERF_EVENT_* ? Are there not a lot of !perf
>>> places to attach BPF proglets?
>>
>> Just gave a complete picture, the below are major places to attach
>> BPF programs:
>> . perf based (through perf ioctl)
>> . raw tracepoint based (through bpf interface)
>>
>> . netlink interface for tc, xdp, tunneling
>> . setsockopt for socket filters
>> . cgroup based (bpf attachment subcommand)
>> mostly networking and io devices
>> . some other networking socket related (sk_skb stream/parser/verdict,
>> sk_msg verdict) through bpf attachment subcommand.
>>
>> Currently, for cgroup based attachment, we have BPF_PROG_QUERY with input cgroup file descriptor. For other networking based queries, we
>> may need to enumerate tc filters, networking devices, open sockets, etc.
>> to get the attachment information.
>>
>> So to have one BPF_QUERY command line may be too complex to
>> cover all cases.
>>
>> But you are right that BPF_PERF_EVENT_QUERY name is too narrow since
>> it should be used for other (pid, fd) based queries as well (e.g., socket, or other potential uses in the future).
>>
>> How about the subcommand name BPF_TASK_FD_QUERY and make bpf_attr.task_fd_query extensible?
>
> I like the introspection output it provides in 7/7, it's really great!
> So the query interface would only ever be tied to BPF progs whose attach
> life time is tied to the life time of the application and as soon as all
> refs on the fd are released it's unloaded from the system. BPF_TASK_FD_QUERY
> seems okay to me, or something like BPF_ATTACH_QUERY. Even if the name is
> slightly more generic, it might be more fitting with other cmds like
> BPF_PROG_QUERY we have where we tell an attach point to retrieve all progs
> from it (though only tied to cgroups right now, it may not be in future).
I think BPF_TASK_FD_QUERY is okay. Using BPF_ATTACH_QUERY indeed seems
a little bit broader to me as other query subcommands are possible to
query attachments with different input.
BPF_PROG_QUERY is also trying to query attachment. Currently, given a
cgroup fd, it will query prog array attached. Sean has the patch to
attach bpf programs to a RC device, and given a device fd, it will
query prog array attached to that device.
>
> For all the others that are not strictly tied to the task but global, bpftool
> would then need to be extended to query the various other interfaces like
> netlink for retrieval which is on todo for some point in future as well. So
> this set nicely complements this introspection aspect.
Totally agree.
Thanks!
>
> Thanks,
> Daniel
>
Powered by blists - more mailing lists