netdev - Re: [PATCH bpf-next v2 02/13] bpf: btf: Add BTF_KIND_FUNC and BTF_KIND_FUNC

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20181106215633.gi6v4fi2llmqo2rn@ast-mbp>
Date:   Tue, 6 Nov 2018 13:56:34 -0800
From:   Alexei Starovoitov <alexei.starovoitov@...il.com>
To:     Edward Cree <ecree@...arflare.com>
Cc:     Martin Lau <kafai@...com>, Yonghong Song <yhs@...com>,
        Alexei Starovoitov <ast@...com>,
        "daniel@...earbox.net" <daniel@...earbox.net>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Kernel Team <Kernel-team@...com>
Subject: Re: [PATCH bpf-next v2 02/13] bpf: btf: Add BTF_KIND_FUNC and
 BTF_KIND_FUNC_PROTO

On Tue, Nov 06, 2018 at 06:52:11PM +0000, Edward Cree wrote:
> On 06/11/18 06:29, Alexei Starovoitov wrote:
> > BTF is not pure type information. BTF is everything that verifier needs
> > to know to make safety decisions that bpf instruction set doesn't have.
> Yes, I'm not disputing that and never have.
> I'm just saying that it will be much cleaner and better if it's
>  internally organised differently.
> 
> > Splitting pure types into one section, variables into another,
> > functions into yet another is not practical, since the same
> > modifiers (like const or volatile) need to be applied to
> > variables and functions. At the end all sections will have
> > the same style of encoding, hence no need to duplicate
> > the encoding three times and instead it's cleaner to encode
> > all of them BTF-style via different KINDs.
> This shows that you've misunderstood what I'm proposing, probably
>  I explained it poorly so I'll try again.
> I'm not suggesting that the 'functions' and 'variables' sections
>  would have _type_ records in them, only that they would reference
>  records in the 'types' section.  So if for instance we have
>     int foo(int x) { int quux; /* ... */ }
>     int bar(int y) { /* ... */ }
>  in the source, then in the 'types' section we would have
> 1 INT 32bits encoding=signed offset=0
> 2 FUNC args=(name="x" type=1,), ret=1
> 3 FUNC args=(name="y" type=1,), ret=1
>  while in the 'functions' section we would have
> 1 "foo" type=2 start_insn_idx=23 insn_count=19 (... maybe other info too...)
> 2 "bar" type=3 start_insn_idx=42 insn_count=5

that looks very weird to me. Why split func name from argument names?
The 'function name' as seen by the BTF may not be the symbol name
as seen in elf file.
Like C++ mangles names for elf. If/when we have C++ front-end for BPF
the BTF func name will be 'class::method' string.
While elf symbol name will still be mangled string in the elf file.

btw libbpf processes that elf symbol name for bpf prog already
and passes it to the kernel as bpf_attr.prog_name.
BTF's func name should be the one seen in the source.
Whatever that source code might be.
There are C, bpftrace, p4 and python frontends. These languages
should be free to put into BTF KIND_FUNC name that makes sense
from the language point of view.
Ideally it's C-like name, so when bpftool prints that BTF
it would be meaningful.

btw we've been thinking to teach libbpf to generate BTF KIND_FUNC
on the fly based on elf symbol name (when real BTF is missing in the elf),
but decided not to go that route for now.

>  and in the 'variables' section we might have
> 1 "quux" type=1 where=stack func=1 offset=-8

that doesn't work. stack slots can be reused by compiler.
variable can be in the register too.
right now we're not planning to have an equivalent of such dwarf
debug info in BTF, since -O2 is mandatory and with optimizations
variable tracking (gcc term) is not effective.
Instead we will annotate every load/store with btf type id.
Meaning no plans to tackle local variables.

The global variables for given .c file will look like single KIND_STRUCT
(which is variable length)
and when libbpf will learn to 'link' multiple .o the BTF deduplication
work (which we're doing for vmlinux) will apply as-is to
combine multiple .o together.

> 
> Thus the graph of types lives entirely in the 'types' section, but
>  things-that-are-not-types don't.  I'm not making a distinction
>  between "pure types" and (somehow) impure types; I'm making a
>  distinction between types (with all their impurities) and
>  *instances* of those types.
> Note that these 'sections' may all really be regions of the '.BTF'
>  ELF section, if that makes the implementation easier.  Also, the
>  'functions' and 'variables' sections _won't_ have the same style
>  of encoding as the 'types', because they're storing entirely
>  different data and in fact don't need variable record sizes.

yes we do see these things differently.
To us function name is the debug info that fits well into BTF description.
Whereas you see the function name part of function declaration
as something 'entirely different'. I don't quite get that point.
In C elf symbol name and in-source func name are the same,
which is probably causing this terminology confusion.