netdev - Re: static and dynamic linking. Was: [PATCH bpf-next v3 1/5] bpf: Support chain calling multiple BPF

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAEf4BzZesSqGVf4FYodfuQz3VnWiyJe4CdN4=sVuYtt891kh5w@mail.gmail.com>
Date:   Wed, 13 Nov 2019 10:51:07 -0800
From:   Andrii Nakryiko <andrii.nakryiko@...il.com>
To:     Edward Cree <ecree@...arflare.com>
Cc:     Alexei Starovoitov <alexei.starovoitov@...il.com>,
        Toke Høiland-Jørgensen <toke@...hat.com>,
        John Fastabend <john.fastabend@...il.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Alexei Starovoitov <ast@...nel.org>,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        Marek Majkowski <marek@...udflare.com>,
        Lorenz Bauer <lmb@...udflare.com>,
        Alan Maguire <alan.maguire@...cle.com>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        David Miller <davem@...emloft.net>,
        Networking <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>
Subject: Re: static and dynamic linking. Was: [PATCH bpf-next v3 1/5] bpf:
 Support chain calling multiple BPF

On Wed, Nov 13, 2019 at 10:30 AM Edward Cree <ecree@...arflare.com> wrote:
>
> On 12/11/2019 23:18, Alexei Starovoitov wrote:
> > On Tue, Nov 12, 2019 at 09:25:06PM +0000, Edward Cree wrote:
> >> Fwiw the 'natural' C way of doing it would be that for any extern symbol in
> >>  the C file, the ELF file gets a symbol entry with st_shndx=SHN_UNDEF, and
> >>  code in .text that uses that symbol gets relocation entries.  That's (AIUI)
> >>  how it works on 'normal' architectures, and that's what my ebld linker
> >>  understands; when it sees a definition in another file for that symbol
> >>  (matched just by the symbol name) it applies all the relocations of the
> >>  symbol to the appropriate progbits.
> >> I don't really see what else you could define 'extern' to mean.
> > That's exactly the problem with standard 'extern'. ELF preserves the name only.
> > There is no type.
> But if you have BTFs, then you can look up each symbol's type at link time and
>  check that they match.  Point being that the BTF is for validation (and

There is no BTF for extern right now, though. Not even expected size
(for extern variables), it's always zero. So until we have BTF emitted
with type information for externs (either variable or funcs), there is
not "expected type", it's just a name of a symbol.

>  therefore strictly optional at this point) rather than using it to identify a
>  symbol.  Trouble with the latter is that BTF ids get renumbered on linking, so
>  any references to them have to change, whereas symbol names stay the same.
>
> > There is also
> > no way to place extern into a section. Currently SEC("..") is a standard way to
> > annotate bpf programs.
> While the symbol itself doesn't have a section, each _use_ of the symbol has a
>  reloc, and the SHT_REL[A] in which that reloc resides has a sh_info specifying
>  "the section header index of the section to which the relocation applies."  So
>  can't that be used if symbol visibility needs to depend on section?  Tbh I
>  can't exactly see why externs need placing in a section in the first place.

That reloc section is a section in which the **code** that uses extern
is located. It's always going to be program or sub-program section, so
not that useful.

With section names for externs, we can use them as a namespace of
sorts, specifying whether the symbol has to come from kernel itself,
some named BPF object (where name comes from section name as well), or
from libbpf itself. This can be specified if symbol name is ambiguous
and defined in multiple places. Or always just for explicitness.

>
> > I think reliable 'extern' has to have more than just
> > name. 'extern int foo;' can be a reference to 'int foo;' in another BPF ELF
> > file, or it can be a reference to 'int foo;' in already loaded BPF prog, or it
> > can be a reference to 'int foo;' inside the kernel itself, or it can be a
> > reference to pseudo variable that libbpf should replace. For example 'extern
> > int kernel_version;' or 'extern int CONFIG_HZ;' would be useful extern-like
> > variables that program might want to use. Disambiguating by name is probably
> > not enough. We can define an order of resolution. libbpf will search in other
> > .o first, then will search in loaded bpf progs, than in kernel, and if all
> > fails than will resolve things like 'extern int CONFIG_HZ' on its own. It feels
> > fragile though.
> It sounds perfectly reasonable and not fragile to me.  The main alternative
>  I see, about equally good, is to not allow defining symbols that are already
>  (non-weakly) defined; so if a bpf prog tries to globally declare "int CONFIG_HZ"
>  or "int netif_receive_skb(struct sk_buff *skb)" then it gets rejected.
>
> > I think we need to be able to specify something like section to
> > extern variables and functions.
> It seems unnecessary to have the user code specify this.  Another a bad
>  analogy: in userland C code you don't have to annotate the function protos in
>  your header files to say whether they come from another .o file, a random
>  library or the libc.  You just declare "a function called this exists somewhere
>  and we'll find it at link time".

With userspace linking, you control which libraries you are trying to
resolve against. Here it might be against any loaded BPF object in a
system. One way around this would be to specify object names to check
(or kernel) programmatically to libbpf on open/load, but it would be
good to be able to do that declaratively as well, IMO.

>
> > I was imagining that the verifier will do per-function verification
> > of program with sub-programs instead of analyzing from root.
> Ah I see.  Yes, that's a very attractive design.
>
> If we make it from a sufficiently generic idea of pre/postconditions, then it
>  could also be useful for e.g. loop bodies (user-supplied annotations that allow
>  us to walk the body only once instead of N times); then a function call just
>  gets standard pre/postconditions generated from its argument types if the user
>  didn't specify something else.
>
> That would then also support things like:
> > The next step is to extend this thought process to integers.
> > int foo(struct xdp_md *arg1, int arg2);
> > The verifier can check that the program is valid for any valid arg1 and
> > arg2 = mark_reg_unbounded().
> ... this but arg2 isn't unbounded.
> However, it might be difficult to do this without exposing details of the
>  verifier into the ABI.  Still, if we can it sounds like it would make John
>  quite happy too.  And of course it doesn't need to have the user annotations
>  from the beginning, it can start out as just the kernel generating pre/post
>  conditions internally.
>
> -Ed