[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAEf4BzYxcSCae=sF3EKNUtLDCZhkhHkd88CEBt4bffzN_AZrDw@mail.gmail.com>
Date: Thu, 17 Feb 2022 14:01:30 -0800
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Masami Hiramatsu <mhiramat@...nel.org>
Cc: Jiri Olsa <olsajiri@...il.com>,
Alexei Starovoitov <alexei.starovoitov@...il.com>,
Steven Rostedt <rostedt@...dmis.org>,
Jiri Olsa <jolsa@...hat.com>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>,
Network Development <netdev@...r.kernel.org>,
bpf <bpf@...r.kernel.org>, lkml <linux-kernel@...r.kernel.org>,
Martin KaFai Lau <kafai@...com>,
Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
John Fastabend <john.fastabend@...il.com>,
KP Singh <kpsingh@...omium.org>,
Oleg Nesterov <oleg@...hat.com>
Subject: Re: [PATCH 0/8] bpf: Add fprobe link
On Thu, Feb 17, 2022 at 6:04 AM Masami Hiramatsu <mhiramat@...nel.org> wrote:
>
> On Wed, 16 Feb 2022 10:27:19 -0800
> Andrii Nakryiko <andrii.nakryiko@...il.com> wrote:
>
> > On Tue, Feb 15, 2022 at 5:21 AM Jiri Olsa <olsajiri@...il.com> wrote:
> > >
> > > On Fri, Feb 04, 2022 at 12:59:42PM +0900, Masami Hiramatsu wrote:
> > > > Hi Alexei,
> > > >
> > > > On Thu, 3 Feb 2022 18:42:22 -0800
> > > > Alexei Starovoitov <alexei.starovoitov@...il.com> wrote:
> > > >
> > > > > On Thu, Feb 3, 2022 at 6:19 PM Steven Rostedt <rostedt@...dmis.org> wrote:
> > > > > >
> > > > > > On Thu, 3 Feb 2022 18:12:11 -0800
> > > > > > Alexei Starovoitov <alexei.starovoitov@...il.com> wrote:
> > > > > >
> > > > > > > > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > > > > > > > transparently.
> > > > > > >
> > > > > > > Not true.
> > > > > > > fprobe is nothing but _explicit_ kprobe on ftrace.
> > > > > > > There was an implicit optimization for kprobe when ftrace
> > > > > > > could be used.
> > > > > > > All this new interface is doing is making it explicit.
> > > > > > > So a new name is not warranted here.
> > > > > > >
> > > > > > > > from that viewpoint, fprobe and kprobe interface are similar but different.
> > > > > > >
> > > > > > > What is the difference?
> > > > > > > I don't see it.
> > > > > >
> > > > > > IIUC, a kprobe on a function (or ftrace, aka fprobe) gives some extra
> > > > > > abilities that a normal kprobe does not. Namely, "what is the function
> > > > > > parameters?"
> > > > > >
> > > > > > You can only reliably get the parameters at function entry. Hence, by
> > > > > > having a probe that is unique to functions as supposed to the middle of a
> > > > > > function, makes sense to me.
> > > > > >
> > > > > > That is, the API can change. "Give me parameter X". That along with some
> > > > > > BTF reading, could figure out how to get parameter X, and record that.
> > > > >
> > > > > This is more or less a description of kprobe on ftrace :)
> > > > > The bpf+kprobe users were relying on that for a long time.
> > > > > See PT_REGS_PARM1() macros in bpf_tracing.h
> > > > > They're meaningful only with kprobe on ftrace.
> > > > > So, no, fprobe is not inventing anything new here.
> > > >
> > > > Hmm, you may be misleading why PT_REGS_PARAM1() macro works. You can use
> > > > it even if CONFIG_FUNCITON_TRACER=n if your kernel is built with
> > > > CONFIG_KPROBES=y. It is valid unless you put a probe out of function
> > > > entry.
> > > >
> > > > > No one is using kprobe in the middle of the function.
> > > > > It's too difficult to make anything useful out of it,
> > > > > so no one bothers.
> > > > > When people say "kprobe" 99 out of 100 they mean
> > > > > kprobe on ftrace/fentry.
> > > >
> > > > I see. But the kprobe is kprobe. It is not designed to support multiple
> > > > probe points. If I'm forced to say, I can rename the struct fprobe to
> > > > struct multi_kprobe, but that doesn't change the essence. You may need
> > > > to use both of kprobes and so-called multi_kprobe properly. (Someone
> > > > need to do that.)
> > >
> > > hi,
> > > tying to kick things further ;-) I was thinking about bpf side of this
> > > and we could use following interface:
> > >
> > > enum bpf_attach_type {
> > > ...
> > > BPF_TRACE_KPROBE_MULTI
> > > };
> > >
> > > enum bpf_link_type {
> > > ...
> > > BPF_LINK_TYPE_KPROBE_MULTI
> > > };
> > >
> > > union bpf_attr {
> > >
> > > struct {
> > > ...
> > > struct {
> > > __aligned_u64 syms;
> > > __aligned_u64 addrs;
> > > __aligned_u64 cookies;
> > > __u32 cnt;
> > > __u32 flags;
> > > } kprobe_multi;
> > > } link_create;
> > > }
> > >
> > > because from bpf user POV it's new link for attaching multiple kprobes
> > > and I agree new 'fprobe' type name in here brings more confusion, using
> > > kprobe_multi is straightforward
> > >
> > > thoguhts?
> >
> > I think this makes sense. We do need new type of link to store ip ->
> > cookie mapping anyways.
>
> This looks good to me too.
>
> >
> > Is there any chance to support this fast multi-attach for uprobe? If
> > yes, we might want to reuse the same link for both (so should we name
> > it more generically?
>
> There is no interface to do that but also there is no limitation to
> expand uprobes. For the kprobes, there are some limitations for the
> function entry because it needs to share the space with ftrace. So
> I introduced fprobe for easier to use.
>
> > on the other hand BPF program type for uprobe is
> > BPF_PROG_TYPE_KPROBE anyway, so keeping it as "kprobe" also would be
> > consistent with what we have today).
>
> Hmm, I'm not sure why BPF made such design choice... (Uprobe needs
> the target program.)
>
We've been talking about sleepable uprobe programs, so we might need
to add uprobe-specific program type, probably. But historically, from
BPF point of view there was no difference between kprobe and uprobe
programs (in terms of how they are run and what's available to them).
>From BPF point of view, it was just attaching BPF program to a
perf_event.
>
> > But yeah, the main question is whether there is something preventing
> > us from supporting multi-attach uprobe as well? It would be really
> > great for USDT use case.
>
> Ah, for the USDT, it will be useful. But since now we will have "user-event"
> which is faster than uprobes, we may be better to consider to use it.
Any pointers? I'm not sure what "user-event" refers to.
>
> I'm not so sure how uprobes probes the target process, but maybe it has
> to manage some memory pages and task related things. If we can split
> those task-related part from struct uprobe software-breakpoint part,
> it maybe easy to support multiple probe (one task-related part + multiple
> software-breakpoint parts.)
>
> Thank you,
>
> --
> Masami Hiramatsu <mhiramat@...nel.org>
Powered by blists - more mailing lists