linux-kernel - Re: [RFC PATCH 00/11] bpf, trace, dtrace: DTrace BPF program type implementation and sample use

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAADnVQJoH4WOQ0t7ZhLgh4kh2obxkFs0UGDRas0y4QSqh1EMsg@mail.gmail.com>
Date:   Mon, 17 Jun 2019 18:32:22 -0700
From:   Alexei Starovoitov <alexei.starovoitov@...il.com>
To:     Kris Van Hees <kris.van.hees@...cle.com>
Cc:     Network Development <netdev@...r.kernel.org>,
        bpf <bpf@...r.kernel.org>, dtrace-devel@....oracle.com,
        LKML <linux-kernel@...r.kernel.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: [RFC PATCH 00/11] bpf, trace, dtrace: DTrace BPF program type
 implementation and sample use

On Mon, Jun 17, 2019 at 6:25 PM Kris Van Hees <kris.van.hees@...cle.com> wrote:
>
> On Thu, May 23, 2019 at 01:28:44PM -0700, Alexei Starovoitov wrote:
>
> << stuff skipped because it is not relevant to the technical discussion... >>
>
> > > > In particular you brought up a good point that there is a use case
> > > > for sharing a piece of bpf program between kprobe and tracepoint events.
> > > > The better way to do that is via bpf2bpf call.
> > > > Example:
> > > > void bpf_subprog(arbitrary args)
> > > > {
> > > > }
> > > >
> > > > SEC("kprobe/__set_task_comm")
> > > > int bpf_prog_kprobe(struct pt_regs *ctx)
> > > > {
> > > >   bpf_subprog(...);
> > > > }
> > > >
> > > > SEC("tracepoint/sched/sched_switch")
> > > > int bpf_prog_tracepoint(struct sched_switch_args *ctx)
> > > > {
> > > >   bpf_subprog(...);
> > > > }
> > > >
> > > > Such configuration is not supported by the verifier yet.
> > > > We've been discussing it for some time, but no work has started,
> > > > since there was no concrete use case.
> > > > If you can work on adding support for it everyone will benefit.
> > > >
> > > > Could you please consider doing that as a step forward?
> > >
> > > This definitely looks to be an interesting addition and I am happy to look into
> > > that further.  I have a few questions that I hope you can shed light on...
> > >
> > > 1. What context would bpf_subprog execute with?  If it can be called from
> > >    multiple different prog types, would it see whichever context the caller
> > >    is executing with?  Or would you envision bpf_subprog to not be allowed to
> > >    access the execution context because it cannot know which one is in use?
> >
> > bpf_subprog() won't be able to access 'ctx' pointer _if_ it's ambiguous.
> > The verifier already smart enough to track all the data flow, so it's fine to
> > pass 'struct pt_regs *ctx' as long as it's accessed safely.
> > For example:
> > void bpf_subprog(int kind, struct pt_regs *ctx1, struct sched_switch_args *ctx2)
> > {
> >   if (kind == 1)
> >      bpf_printk("%d", ctx1->pc);
> >   if (kind == 2)
> >      bpf_printk("%d", ctx2->next_pid);
> > }
> >
> > SEC("kprobe/__set_task_comm")
> > int bpf_prog_kprobe(struct pt_regs *ctx)
> > {
> >   bpf_subprog(1, ctx, NULL);
> > }
> >
> > SEC("tracepoint/sched/sched_switch")
> > int bpf_prog_tracepoint(struct sched_switch_args *ctx)
> > {
> >   bpf_subprog(2, NULL, ctx);
> > }
> >
> > The verifier should be able to prove that the above is correct.
> > It can do so already if s/ctx1/map_value1/, s/ctx2/map_value2/
> > What's missing is an ability to have more than one 'starting' or 'root caller'
> > program.
> >
> > Now replace SEC("tracepoint/sched/sched_switch") with SEC("cgroup/ingress")
> > and it's becoming clear that BPF_PROG_TYPE_PROBE approach is not good enough, right?
> > Folks are already sharing the bpf progs between kprobe and networking.
> > Currently it's done via code duplication and actual sharing happens via maps.
> > That's not ideal, hence we've been discussing 'shared library' approach for
> > quite some time. We need a way to support common bpf functions that can be called
> > from networking and from tracing programs.
> >
> > > 2. Given that BPF programs are loaded with a specification of the prog type,
> > >    how would one load a code construct as the one you outline above?  How can
> > >    you load a BPF function and have it be used as subprog from programs that
> > >    are loaded separately?  I.e. in the sample above, if bpf_subprog is loaded
> > >    as part of loading bpf_prog_kprobe (prog type KPROBE), how can it be
> > >    referenced from bpf_prog_tracepoint (prog type TRACEPOINT) which would be
> > >    loaded separately?
> >
> > The api to support shared libraries was discussed, but not yet implemented.
> > We've discussed 'FD + name' approach.
> > FD identifies a loaded program (which is root program + a set of subprogs)
> > and other programs can be loaded at any time later. The BPF_CALL instructions
> > in such later program would refer to older subprogs via FD + name.
> > Note that both tracing and networking progs can be part of single elf file.
> > libbpf has to be smart to load progs into kernel step by step
> > and reusing subprogs that are already loaded.
> >
> > Note that libbpf work for such feature can begin _without_ kernel changes.
> > libbpf can pass bpf_prog_kprobe+bpf_subprog as a single program first,
> > then pass bpf_prog_tracepoint+bpf_subprog second (as a separate program).
> > The bpf_subprog will be duplicated and JITed twice, but sharing will happen
> > because data structures (maps, global and static data) will be shared.
> > This way the support for 'pseudo shared libraries' can begin.
> > (later accompanied by FD+name kernel support)
>
> As far as I can determine, the current libbpd implementation is already able
> to do the duplication of the called function, even when the ELF object contains
> programs of differemt program types.  I.e. the example you give at the top
> of the email actually seems to work already.  Right?

Have you tried it?

> In that case, I am a bit unsure what more can be done on the side of libbpf
> without needing kernel changes?

it's a bit weird to discuss hypothetical kernel changes when the first step
of changing libbpf wasn't even attempted.