netdev - Re: [PATCH bpf-next 0/2] libbpf: Implement BTF Generator API

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGqxgpuB_L519RK6mGUrt9XTHnYJTrZY9AuQqgQ+p196k+oE1g@mail.gmail.com>
Date:   Fri, 29 Oct 2021 02:41:42 -0300
From:   Rafael David Tinoco <rafaeldtinoco@...il.com>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc:     Mauricio Vásquez <mauricio@...volk.io>,
        Network Development <netdev@...r.kernel.org>,
        bpf <bpf@...r.kernel.org>, Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>
Subject: Re: [PATCH bpf-next 0/2] libbpf: Implement BTF Generator API

On Thu, Oct 28, 2021 at 11:34 PM Alexei Starovoitov
<alexei.starovoitov@...il.com> wrote:
>
> On Wed, Oct 27, 2021 at 1:37 PM Mauricio Vásquez <mauricio@...volk.io> wrote:
> > There is also a good example[3] on how to use BTFGen and BTFHub together
> > to generate multiple BTF files, to each existing/supported kernel,
> > tailored to one application. For example: a complex bpf object might
> > support nearly 400 kernels by having BTF files summing only 1.5 MB.
>
> Could you share more details on what kind of fields and types
> were used to achieve this compression?
> Tracing progs will be peeking into task_struct.
> To describe it in the reduced BTF most of the kernel types would be needed,
> so I'm a bit skeptical on the practicality of the algorithm.

https://github.com/aquasecurity/btfhub/tree/main/tools

has a complete README and, at the end, the example used:

https://github.com/aquasecurity/btfhub/tree/main/tools#time-to-test-btfgen-and-btfhub

We tested btfgen with bpfcc tools and tracee:

https://github.com/aquasecurity/tracee/blob/main/tracee-ebpf/tracee/tracee.bpf.c

and the generated BTF files worked. If you run something like:

./btfgen.sh [.../aquasec-tracee/tracee-ebpf/dist/tracee.bpf.core.o]

it will generate the BTFs tailored to a given eBPF object file, 1 smaller BTF
file per existing full external raw BTF file (1 per kernel version, basically).

All the ~500 kernels generated the same amount of BTF files with ~3MB in
total. We then remove all the BTF files that are equal to their previous
kernels:

https://github.com/aquasecurity/btfhub/blob/main/tools/btfgen.sh#L113

and we are able to reduce from 3MB to 1.5MB (as similar BTF files are symlinks
to the previous ones).

> I think it may work for sk_buff, since it will pull struct sock,
> net_device, rb_tree, and not a ton more.
> Have you considered generating kernel BTF with fields that are accessed
> by bpf prog only and replacing all other fields with padding ?

That is exactly the result of our final BTF file. We only include the
fields and types being used by the given eBPF object:

```
$ bpftool btf dump file ./generated/5.4.0-87-generic.btf format raw
[1] PTR '(anon)' type_id=99
[2] TYPEDEF 'u32' type_id=35
[3] TYPEDEF '__be16' type_id=22
[4] PTR '(anon)' type_id=52
[5] TYPEDEF '__u8' type_id=83
[6] PTR '(anon)' type_id=29
[7] STRUCT 'mnt_namespace' size=120 vlen=1
    'ns' type_id=72 bits_offset=64
[8] TYPEDEF '__kernel_gid32_t' type_id=75
[9] STRUCT 'iovec' size=16 vlen=2
    'iov_base' type_id=16 bits_offset=0
    'iov_len' type_id=85 bits_offset=64
[10] PTR '(anon)' type_id=58
[11] STRUCT '(anon)' size=8 vlen=2
    'skc_daddr' type_id=81 bits_offset=0
    'skc_rcv_saddr' type_id=81 bits_offset=32
[12] TYPEDEF '__u64' type_id=89
...
[120] STRUCT 'task_struct' size=9216 vlen=13
    'thread_info' type_id=105 bits_offset=0
    'real_parent' type_id=30 bits_offset=18048
    'real_cred' type_id=16 bits_offset=21248
    'pid' type_id=14 bits_offset=17920
    'mm' type_id=110 bits_offset=16512
    'thread_pid' type_id=56 bits_offset=18752
    'exit_code' type_id=123 bits_offset=17152
    'group_leader' type_id=30 bits_offset=18432
    'flags' type_id=75 bits_offset=288
    'thread_group' type_id=87 bits_offset=19328
    'tgid' type_id=14 bits_offset=17952
    'nsproxy' type_id=100 bits_offset=22080
    'comm' type_id=96 bits_offset=21440
[121] STRUCT 'pid' size=96 vlen=1
    'numbers' type_id=77 bits_offset=640
[122] STRUCT 'new_utsname' size=390 vlen=1
    'nodename' type_id=48 bits_offset=520
[123] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
[124] ARRAY '(anon)' type_id=35 index_type_id=123 nr_elems=2
```

If you do a "format c" to the generated BTF file, then bpftool considers
everything as padding:

```
typedef unsigned char __u8;

struct ns_common {
        long: 64;
        long: 64;
        unsigned int inum;
        int: 32;
};

struct mnt_namespace {
        long: 64;
        struct ns_common ns;
        long: 64;
        long: 64;
        long: 64;
        long: 64;
        long: 64;
        long: 64;
        long: 64;
        long: 64;
        long: 64;
        long: 64;
        long: 64;
};

typedef unsigned int __kernel_gid32_t;
```

But libbpf is still able to calculate all field relocations.

> I think the algo would be quite different from the actual CO-RE logic
> you're trying to reuse.
> If CO-RE matching style is necessary and it's the best approach then please
> add new logic to bpftool.

Yes, we're heading that direction I suppose :\ ... Accessing .BTF.ext, to get
the "bpf_core_relo" information requires libbpf internals to be exposed:

https://github.com/rafaeldtinoco/btfgen/blob/standalone/btfgen2.c#L119
https://github.com/rafaeldtinoco/btfgen/blob/standalone/include/stolen.h

we would have to check if we can try to export what we need for that (instead
of re-declaring internal headers, which obviously looks bad).