netdev - Re: [PATCH bpf-next 4/6] bpf, libbpf: add bpf_tail_call

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <5f3850b2-7346-02d7-50f5-f63355115f35@iogearbox.net>
Date:   Fri, 25 Sep 2020 00:17:36 +0200
From:   Daniel Borkmann <daniel@...earbox.net>
To:     Andrii Nakryiko <andrii.nakryiko@...il.com>
Cc:     Alexei Starovoitov <ast@...nel.org>,
        john fastabend <john.fastabend@...il.com>,
        Networking <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>
Subject: Re: [PATCH bpf-next 4/6] bpf, libbpf: add bpf_tail_call_static helper
 for bpf programs

On 9/24/20 10:53 PM, Andrii Nakryiko wrote:
> On Thu, Sep 24, 2020 at 11:22 AM Daniel Borkmann <daniel@...earbox.net> wrote:
>>
>> Port of tail_call_static() helper function from Cilium's BPF code base [0]
>> to libbpf, so others can easily consume it as well. We've been using this
>> in production code for some time now. The main idea is that we guarantee
>> that the kernel's BPF infrastructure and JIT (here: x86_64) can patch the
>> JITed BPF insns with direct jumps instead of having to fall back to using
>> expensive retpolines. By using inline asm, we guarantee that the compiler
>> won't merge the call from different paths with potentially different
>> content of r2/r3.
>>
>> We're also using __throw_build_bug() macro in different places as a neat
>> trick to trigger compilation errors when compiler does not remove code at
>> compilation time. This works for the BPF backend as it does not implement
>> the __builtin_trap().
>>
>>    [0] https://github.com/cilium/cilium/commit/f5537c26020d5297b70936c6b7d03a1e412a1035
>>
>> Signed-off-by: Daniel Borkmann <daniel@...earbox.net>
>> ---
>>   tools/lib/bpf/bpf_helpers.h | 32 ++++++++++++++++++++++++++++++++
>>   1 file changed, 32 insertions(+)
>>
>> diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h
>> index 1106777df00b..18b75a4c82e6 100644
>> --- a/tools/lib/bpf/bpf_helpers.h
>> +++ b/tools/lib/bpf/bpf_helpers.h
>> @@ -53,6 +53,38 @@
>>          })
>>   #endif
>>
>> +/*
>> + * Misc useful helper macros
>> + */
>> +#ifndef __throw_build_bug
>> +# define __throw_build_bug()   __builtin_trap()
>> +#endif
> 
> this will become part of libbpf stable API, do we want/need to expose
> it? If we want to expose it, then we should probably provide a better
> description.
> 
> But also curious, how is it better than _Static_assert() (see
> test_cls_redirect.c), which also allows to provide a better error
> message?

Need to get back to you whether that has same semantics. We use the __throw_build_bug()
also in __bpf_memzero() and friends [0] as a way to trigger a hard build bug if we hit
a default switch-case [0], so we detect unsupported sizes which are not covered by the
implementation yet. If _Static_assert (0, "foo") does the trick, we could also use that;
will check with our code base.

   [0] https://github.com/cilium/cilium/blob/master/bpf/include/bpf/builtins.h

>> +static __always_inline void
>> +bpf_tail_call_static(void *ctx, const void *map, const __u32 slot)
>> +{
>> +       if (!__builtin_constant_p(slot))
>> +               __throw_build_bug();
>> +
>> +       /*
>> +        * Don't gamble, but _guarantee_ that LLVM won't optimize setting
>> +        * r2 and r3 from different paths ending up at the same call insn as
>> +        * otherwise we won't be able to use the jmpq/nopl retpoline-free
>> +        * patching by the x86-64 JIT in the kernel.
>> +        *
> 
> So the clobbering comment below is completely clear. But this one is
> less clear without some sort of example situation in which bad things
> happen. Do you mind providing some pseudo-C example in which the
> compiler will optimize things in such a way that the tail call
> patching won't happen?

The details are pretty much here [1] and mentioned at plumbers, so if we end up from
different paths with different map or const key at the same tail-call call insn, then
the record_func_key() will determine that we need to emit retpoline instead of patching
which could have occured with two tail-call call insns, for example. This helper is just
to guarantee that the latter will always happen.

   [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d2e4c1e6c2947269346054ac8937ccfe9e0bcc6b

>> +        * Note on clobber list: we need to stay in-line with BPF calling
>> +        * convention, so even if we don't end up using r0, r4, r5, we need
>> +        * to mark them as clobber so that LLVM doesn't end up using them
>> +        * before / after the call.
>> +        */
>> +       asm volatile("r1 = %[ctx]\n\t"
>> +                    "r2 = %[map]\n\t"
>> +                    "r3 = %[slot]\n\t"
>> +                    "call 12\n\t"
>> +                    :: [ctx]"r"(ctx), [map]"r"(map), [slot]"i"(slot)
>> +                    : "r0", "r1", "r2", "r3", "r4", "r5");
>> +}
>> +
>>   /*
>>    * Helper structure used by eBPF C program
>>    * to describe BPF map attributes to libbpf loader
>> --
>> 2.21.0
>>