[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c9846f8a-1bc6-0fc2-c1e6-23f21401f278@iogearbox.net>
Date: Thu, 17 May 2018 17:32:28 +0200
From: Daniel Borkmann <daniel@...earbox.net>
To: Yonghong Song <yhs@...com>, Peter Zijlstra <peterz@...radead.org>
Cc: ast@...com, netdev@...r.kernel.org, kernel-team@...com
Subject: Re: [PATCH bpf-next 2/7] bpf: introduce bpf subcommand
BPF_PERF_EVENT_QUERY
On 05/16/2018 11:59 PM, Yonghong Song wrote:
> On 5/16/18 4:27 AM, Peter Zijlstra wrote:
>> On Tue, May 15, 2018 at 04:45:16PM -0700, Yonghong Song wrote:
>>> Currently, suppose a userspace application has loaded a bpf program
>>> and attached it to a tracepoint/kprobe/uprobe, and a bpf
>>> introspection tool, e.g., bpftool, wants to show which bpf program
>>> is attached to which tracepoint/kprobe/uprobe. Such attachment
>>> information will be really useful to understand the overall bpf
>>> deployment in the system.
>>>
>>> There is a name field (16 bytes) for each program, which could
>>> be used to encode the attachment point. There are some drawbacks
>>> for this approaches. First, bpftool user (e.g., an admin) may not
>>> really understand the association between the name and the
>>> attachment point. Second, if one program is attached to multiple
>>> places, encoding a proper name which can imply all these
>>> attachments becomes difficult.
>>>
>>> This patch introduces a new bpf subcommand BPF_PERF_EVENT_QUERY.
>>> Given a pid and fd, if the <pid, fd> is associated with a
>>> tracepoint/kprobe/uprobea perf event, BPF_PERF_EVENT_QUERY will return
>>> . prog_id
>>> . tracepoint name, or
>>> . k[ret]probe funcname + offset or kernel addr, or
>>> . u[ret]probe filename + offset
>>> to the userspace.
>>> The user can use "bpftool prog" to find more information about
>>> bpf program itself with prog_id.
>>>
>>> Signed-off-by: Yonghong Song <yhs@...com>
>>> ---
>>> include/linux/trace_events.h | 15 ++++++
>>> include/uapi/linux/bpf.h | 25 ++++++++++
>>> kernel/bpf/syscall.c | 113 +++++++++++++++++++++++++++++++++++++++++++
>>> kernel/trace/bpf_trace.c | 53 ++++++++++++++++++++
>>> kernel/trace/trace_kprobe.c | 29 +++++++++++
>>> kernel/trace/trace_uprobe.c | 22 +++++++++
>>> 6 files changed, 257 insertions(+)
>>
>> Why is the command called *_PERF_EVENT_* ? Are there not a lot of !perf
>> places to attach BPF proglets?
>
> Just gave a complete picture, the below are major places to attach
> BPF programs:
> . perf based (through perf ioctl)
> . raw tracepoint based (through bpf interface)
>
> . netlink interface for tc, xdp, tunneling
> . setsockopt for socket filters
> . cgroup based (bpf attachment subcommand)
> mostly networking and io devices
> . some other networking socket related (sk_skb stream/parser/verdict,
> sk_msg verdict) through bpf attachment subcommand.
>
> Currently, for cgroup based attachment, we have BPF_PROG_QUERY with input cgroup file descriptor. For other networking based queries, we
> may need to enumerate tc filters, networking devices, open sockets, etc.
> to get the attachment information.
>
> So to have one BPF_QUERY command line may be too complex to
> cover all cases.
>
> But you are right that BPF_PERF_EVENT_QUERY name is too narrow since
> it should be used for other (pid, fd) based queries as well (e.g., socket, or other potential uses in the future).
>
> How about the subcommand name BPF_TASK_FD_QUERY and make bpf_attr.task_fd_query extensible?
I like the introspection output it provides in 7/7, it's really great!
So the query interface would only ever be tied to BPF progs whose attach
life time is tied to the life time of the application and as soon as all
refs on the fd are released it's unloaded from the system. BPF_TASK_FD_QUERY
seems okay to me, or something like BPF_ATTACH_QUERY. Even if the name is
slightly more generic, it might be more fitting with other cmds like
BPF_PROG_QUERY we have where we tell an attach point to retrieve all progs
from it (though only tied to cgroups right now, it may not be in future).
For all the others that are not strictly tied to the task but global, bpftool
would then need to be extended to query the various other interfaces like
netlink for retrieval which is on todo for some point in future as well. So
this set nicely complements this introspection aspect.
Thanks,
Daniel
Powered by blists - more mailing lists