Message-ID: <20191115003024.h7eg2kbve23jmzqn@ast-mbp.dhcp.thefacebook.com>
Date: Thu, 14 Nov 2019 16:30:26 -0800
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Björn Töpel <bjorn.topel@...il.com>
Cc: netdev@...r.kernel.org, ast@...nel.org, daniel@...earbox.net,
Björn Töpel <bjorn.topel@...el.com>,
bpf@...r.kernel.org, magnus.karlsson@...il.com,
magnus.karlsson@...el.com, jonathan.lemon@...il.com
Subject: Re: [RFC PATCH bpf-next 2/4] bpf: introduce BPF dispatcher
On Wed, Nov 13, 2019 at 09:47:35PM +0100, Björn Töpel wrote:
> From: Björn Töpel <bjorn.topel@...el.com>
>
> The BPF dispatcher builds on top of the BPF trampoline ideas; it
> builds on bpf_arch_text_poke() and (re-)uses the BPF JIT code
> generation. The dispatcher builds a dispatch table for XDP programs,
> for retpoline avoidance. The table is searched with a simple binary
> search, so lookup is O(log n). Here, the dispatch table is limited
> to four entries (for laziness reasons -- only 1B relative jumps
> :-P). If the dispatch table is full, it falls back to the retpoline
> path.
>
> An example: A module/driver allocates a dispatcher. The dispatcher
> is shared by all netdevs. Each netdev allocates a slot in the
> dispatcher and a BPF program. The netdev then uses the dispatcher to
> call the correct program with a direct call (actually a tail-call).
>
> Signed-off-by: Björn Töpel <bjorn.topel@...el.com>
> ---
> arch/x86/net/bpf_jit_comp.c | 96 ++++++++++++++++++
> kernel/bpf/Makefile | 1 +
> kernel/bpf/dispatcher.c | 197 ++++++++++++++++++++++++++++++++++++
> 3 files changed, 294 insertions(+)
> create mode 100644 kernel/bpf/dispatcher.c
>
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index 28782a1c386e..d75aebf508b8 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -10,10 +10,12 @@
> #include <linux/if_vlan.h>
> #include <linux/bpf.h>
> #include <linux/memory.h>
> +#include <linux/sort.h>
> #include <asm/extable.h>
> #include <asm/set_memory.h>
> #include <asm/nospec-branch.h>
> #include <asm/text-patching.h>
> +#include <asm/asm-prototypes.h>
>
> static u8 *emit_code(u8 *ptr, u32 bytes, unsigned int len)
> {
> @@ -1471,6 +1473,100 @@ int arch_prepare_bpf_trampoline(void *image, struct btf_func_model *m, u32 flags
> return 0;
> }
>
> +#if defined(CONFIG_BPF_JIT) && defined(CONFIG_RETPOLINE)
> +
> +/* Emits the dispatcher. Id lookup is limited to BPF_DISPATCHER_MAX,
> + * so it'll fit into PAGE_SIZE/2. The lookup is binary search: O(log
> + * n).
> + */
> +static int emit_bpf_dispatcher(u8 **pprog, int a, int b, u64 *progs,
> + u8 *fb)
> +{
> + u8 *prog = *pprog, *jg_reloc;
> + int pivot, err, cnt = 0;
> + s64 jmp_offset;
> +
> + if (a == b) {
> + emit_mov_imm64(&prog, BPF_REG_0, /* movabs func,%rax */
> + progs[a] >> 32,
> + (progs[a] << 32) >> 32);
Could you try optimizing emit_mov_imm64() to recognize s32?
IIRC there is a single x86 insn that can move and sign-extend.
That should cut down on bytecode size and probably make things a bit faster?
Another alternative is to compare the lower 32 bits only, since on x86-64
the upper 32 bits should be ~0 anyway for bpf prog pointers.
Looking at the bookkeeping code, I think I should be able to generalize the
bpf trampoline a bit and share the code with the bpf dispatcher.
Could you also try aligning the jmp targets a bit by inserting nops?
Some x86 cpus are sensitive to jmp target alignment. Even without considering
the JCC bug it could be helpful, especially since we're talking about XDP/AF_XDP
here, which will be pushing millions of calls through the bpf dispatcher.