netdev - Re: [PATCH bpf-next v2 02/13] bpf: btf: Add BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO

Message-ID: <6c1b7b7a-f54d-0c77-f04c-f46f2a15d2f7@fb.com>
Date:   Fri, 9 Nov 2018 01:26:03 +0000
From:   Yonghong Song <yhs@...com>
To:     Edward Cree <ecree@...arflare.com>,
        Alexei Starovoitov <alexei.starovoitov@...il.com>
CC:     Martin Lau <kafai@...com>, Alexei Starovoitov <ast@...com>,
        "Daniel Borkmann" <daniel@...earbox.net>,
        Network Development <netdev@...r.kernel.org>,
        Kernel Team <Kernel-team@...com>
Subject: Re: [PATCH bpf-next v2 02/13] bpf: btf: Add BTF_KIND_FUNC and
 BTF_KIND_FUNC_PROTO



On 11/8/18 2:56 PM, Edward Cree wrote:
> On 08/11/18 19:42, Alexei Starovoitov wrote:
>> same link let's continue at 1pm PST.
> So, one thing we didn't really get onto was maps, and you mentioned that it
>   wasn't really clear what I was proposing there.
> What I have in mind comes in two parts:
> 1) map type.  A new BTF_KIND_MAP with metadata 'key_type', 'value_type'
>   (both are type_ids referencing other BTF type records), describing the
>   type "map from key_type to value_type".
> 2) record in the 'instances' table.  This would have a name_off (the
>   name of the map), a type_id (pointing at a BTF_KIND_MAP in the 'types'
>   table), and potentially also some indication of what symbol (from
>   section 'maps') refers to this map.  This is pretty much the exact
>   same metadata that a function in the 'instances' table has, the only
>   differences being
>   (a) function's type_id points at a BTF_KIND_FUNC record
>   (b) function's symbol indication refers to the .text section
>   (c) in future functions may be nested inside other functions, whereas
>   AIUI a map can't live inside a function.  (But a variable, which is
>   the other thing that would want to go in an 'instances' table, can.)
> So the 'instances' table record structure looks like
> 
> struct btf_instance {
>      __u32 type_id; /* Type of object declared.  An index into type section */
>      __u32 name_off; /* Name of object.  An offset into string section */
>      __u32 parent; /* Containing object if any (else 0).  An index into instance section */
> };
> 
> and we extend the BTF header:
> 
> struct btf_header {
>      __u16   magic;
>      __u8    version;
>      __u8    flags;
>      __u32   hdr_len;
> 
>      /* All offsets are in bytes relative to the end of this header */
>      __u32   type_off;      /* offset of type section       */
>      __u32   type_len;      /* length of type section       */
>      __u32   str_off;       /* offset of string section     */
>      __u32   str_len;       /* length of string section     */
>      __u32   inst_off;      /* offset of instance section   */
>      __u32   inst_len;      /* length of instance section   */
> };
> 
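
To illustrate how a consumer might size the proposed instance section, here
is a minimal sketch in C. The structs copy the layout quoted above (with
stdint types in place of __u32 and friends); the helper name
btf_instance_count() and the blob_len parameter are hypothetical and not part
of any existing API. It assumes the section is a packed array of
struct btf_instance records.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layouts copied from the proposal quoted above. */
struct btf_instance {
    uint32_t type_id;   /* index into type section */
    uint32_t name_off;  /* offset into string section */
    uint32_t parent;    /* containing instance, or 0 */
};

struct btf_header {
    uint16_t magic;
    uint8_t  version;
    uint8_t  flags;
    uint32_t hdr_len;
    /* All offsets are in bytes relative to the end of this header. */
    uint32_t type_off, type_len;
    uint32_t str_off,  str_len;
    uint32_t inst_off, inst_len;  /* proposed extension */
};

/*
 * Hypothetical helper: return the number of instance records, or -1 if the
 * section runs past the end of the blob or is not a whole number of records.
 * blob_len is the total size of the BTF data, header included.
 */
long btf_instance_count(const struct btf_header *hdr, size_t blob_len)
{
    assert(hdr != NULL);

    /* 64-bit arithmetic so the sums cannot wrap around. */
    uint64_t start = (uint64_t)hdr->hdr_len + hdr->inst_off;
    uint64_t end = start + hdr->inst_len;

    if (end > blob_len)
        return -1;
    if (hdr->inst_len % sizeof(struct btf_instance) != 0)
        return -1;
    return (long)(hdr->inst_len / sizeof(struct btf_instance));
}
```

The same kind of check would presumably apply to the type and string
sections; only the instance section is shown here.
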
> Then in the .BTF.ext section, we have both
> 
> struct bpf_func_info {
>      __u32 prog_symbol; /* Index of symbol giving address of subprog */
>      __u32 inst_id; /* Index into instance section */
> };
> 
> struct bpf_map_info {
>      __u32 map_symbol; /* Index of symbol creating this map */
>      __u32 inst_id; /* Index into instance section */
> };
> 
> (either living in different subsections, or in a single table with
>   the addition of a kind field, or in a single table relying on the
>   ultimately referenced type to distinguish funcs from maps).
> 
> Note that the name (in btf_instance) of a map or function need not
>   match the name of the corresponding symbol; we use the .BTF.ext
>   section to tie together btf_instance IDs and symbol IDs.  Then in
>   the case of functions (subprogs), the prog_symbol can be looked
>   up in the ELF symbol table to find the address (== insn_offset)
>   of the subprog, as well as the section containing it (since that
>   might not be .text).  Similarly in the case of maps the BTF info
>   about the map is connected with the info in the maps section.
> 
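
The lookup described in the paragraph above could be sketched roughly as
follows. This is only an illustration under stated assumptions: struct
demo_sym stands in for the two ELF symbol fields that matter here, and
struct demo_obj, struct resolved_func and resolve_func() are hypothetical
names invented for the example. It also assumes inst_id and prog_symbol are
plain array indices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Record layouts as proposed in the quoted message. */
struct btf_instance {
    uint32_t type_id;   /* index into type section */
    uint32_t name_off;  /* offset into string section */
    uint32_t parent;    /* containing instance, or 0 */
};

struct bpf_func_info {
    uint32_t prog_symbol;  /* index of symbol giving address of subprog */
    uint32_t inst_id;      /* index into instance section */
};

/* Hypothetical stand-in for the ELF symbol fields used here. */
struct demo_sym {
    uint64_t value;  /* address of the subprog within its section */
    uint16_t shndx;  /* index of the section containing it */
};

/* Hypothetical view of the already-parsed tables. */
struct demo_obj {
    const struct btf_instance *insts;
    size_t n_insts;
    const struct demo_sym *syms;
    size_t n_syms;
    const char *strs;  /* BTF string section */
    size_t str_len;
};

struct resolved_func {
    const char *name;  /* BTF name; need not match the symbol name */
    uint32_t type_id;  /* type record describing the function */
    uint64_t addr;     /* taken from the symbol table */
    uint16_t shndx;    /* may or may not be .text */
};

/* Join one bpf_func_info record against both tables; 0 on success. */
int resolve_func(const struct demo_obj *obj, const struct bpf_func_info *fi,
                 struct resolved_func *out)
{
    assert(obj != NULL && fi != NULL && out != NULL);

    if (fi->inst_id >= obj->n_insts || fi->prog_symbol >= obj->n_syms)
        return -1;

    const struct btf_instance *inst = &obj->insts[fi->inst_id];
    const struct demo_sym *sym = &obj->syms[fi->prog_symbol];

    if (inst->name_off >= obj->str_len)
        return -1;

    out->name = obj->strs + inst->name_off;
    out->type_id = inst->type_id;
    out->addr = sym->value;
    out->shndx = sym->shndx;
    return 0;
}
```

A bpf_map_info record could be joined the same way, with map_symbol in place
of prog_symbol.
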
> Now when the loader has munged this, what it passes to the kernel
>   might not have map_symbol, but instead map_fd.  Instead of
>   prog_symbol it will have whatever identifies the subprog in the
>   blob of stuff it feeds to the kernel (so probably insn_offset).
> 
> All this would of course require a bit more compiler support than
>   the current BPF_ANNOTATE_KV_PAIR, since that just causes the
>   existing BTF machinery to declare a specially constructed struct
>   type.  At the C level you could still have BPF_ANNOTATE_KV_PAIR
>   and the '____bpf_map_foo' name, but then the compiler would
>   recognise that and convert it into an instance record by looking
>   up the name 'foo' in its "maps" section.  That way the special
>   ____bpf_map_* handling (which ties map names to symbol names,

Compilers in general do not perform transformations based on variable
or struct type names by default, so this should probably stay
in the loader.

>   also) would be entirely compiler-internal and not 'leak out' into
>   the definition of the format.  Frontends for other languages
>   which do possess a native map type (e.g. Python dict) might have
>   other ways of indicating the key/value type of a map at source
>   level (e.g. PEP 484) and could directly generate the appropriate
>   BTF_KIND_MAP and bpf_map_info records rather than (as they would
>   with the current design) having to encode the information as a
>   struct ____bpf_map_foo type-definition.

You mean a Python application can generate BPF bytecode and
BTF (including map types)? That would be different from the C/LLVM
use case. The Python app would probably be the loader as well.

One option is to pass a BPF-specific flag like
-map-type-prefix="___bpf_map_", and LLVM will generate a BTF_KIND_MAP type
for any structure named "___bpf_map_<...>". But in that case the user
can just search the type table for struct names of the form
"___bpf_map_<...>", and LLVM does not need to do anything. Note that
once the user passes -map-type-prefix="___bpf_map_" to LLVM, the
definition of the format has already leaked.

So I feel that this probably belongs in the loader.
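
As a rough sketch of what such a loader-side search could look like: the
code below scans a simplified, already-parsed type table for a struct whose
name is the prefix followed by the map name. struct demo_type, the DEMO_KIND_*
constants and both function names are hypothetical and only stand in for
whatever the loader uses to walk the BTF type section; the prefix is passed
in as a parameter rather than fixed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of a parsed BTF type record. */
enum { DEMO_KIND_INT = 1, DEMO_KIND_STRUCT = 2 };

struct demo_type {
    int kind;
    const char *name;  /* NULL for anonymous types */
};

/*
 * If type_name is the prefix followed by a non-empty suffix, return a
 * pointer to that suffix (the map name); otherwise return NULL.
 */
const char *map_name_from_struct(const char *type_name, const char *prefix)
{
    assert(type_name != NULL && prefix != NULL);

    size_t n = strlen(prefix);

    if (strncmp(type_name, prefix, n) != 0 || type_name[n] == '\0')
        return NULL;
    return type_name + n;
}

/* Return the index of the struct annotating map_name, or -1 if none. */
int find_map_annotation(const struct demo_type *types, size_t n_types,
                        const char *prefix, const char *map_name)
{
    for (size_t i = 0; i < n_types; i++) {
        if (types[i].kind != DEMO_KIND_STRUCT || types[i].name == NULL)
            continue;
        const char *suffix = map_name_from_struct(types[i].name, prefix);
        if (suffix != NULL && strcmp(suffix, map_name) == 0)
            return (int)i;
    }
    return -1;
}
```

With this approach the naming convention stays an agreement between the C
macro and the loader, and the compiler emits an ordinary struct type.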

> 
> 
> While I realise the desire to concentrate on one topic at once, I
>   think this question of maps should be discussed in tomorrow's
>   call, since it is when we start having other kinds of instances
>   besides functions that the advantages of my design become
>   apparent, unifying the process of 'declaration' of functions,
>   maps, and (eventually) variables while separating them all from
>   the process of 'definition' of the types of all three.
> 
> Thank you for your continued patience with me.
> -Ed
>