linux-kernel - Re: [PATCH 4/4] arm64: bpf: Elide some moves to a0 after calls

Message-ID: <CAJ+HfNjkacY-KStgGJMgvQh2=2OsMnH6Saij+nAPBqQrSJcNWw@mail.gmail.com>
Date:   Tue, 4 Feb 2020 20:13:17 +0100
From:   Björn Töpel <bjorn.topel@...il.com>
To:     Palmer Dabbelt <palmerdabbelt@...gle.com>
Cc:     Daniel Borkmann <daniel@...earbox.net>,
        Alexei Starovoitov <ast@...nel.org>, zlim.lnx@...il.com,
        catalin.marinas@....com, will@...nel.org,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        Andrii Nakryiko <andriin@...com>,
        Shuah Khan <shuah@...nel.org>, Netdev <netdev@...r.kernel.org>,
        bpf <bpf@...r.kernel.org>, linux-arm-kernel@...ts.infradead.org,
        LKML <linux-kernel@...r.kernel.org>,
        linux-kselftest@...r.kernel.org,
        clang-built-linux@...glegroups.com, kernel-team@...roid.com
Subject: Re: [PATCH 4/4] arm64: bpf: Elide some moves to a0 after calls

On Tue, 28 Jan 2020 at 03:15, Palmer Dabbelt <palmerdabbelt@...gle.com> wrote:
>
> On arm64, the BPF function ABI doesn't match the C function ABI.  Specifically,
> arm64 encodes calls as `a0 = f(a0, a1, ...)` while BPF encodes calls as
> `BPF_REG_0 = f(BPF_REG_1, BPF_REG_2, ...)`.  This discrepancy results in
> function calls being encoded as a two operations sequence that first does a C
> ABI calls and then moves the return register into the right place.  This
> results in one extra instruction for every function call.
>

That's a lot of extra work to save one reg-to-reg move, but that extra
move has always annoyed me in the RISC-V JIT. :-) So, if it *can* be
avoided, why not.

[...]
>
> +static int dead_register(const struct jit_ctx *ctx, int offset, int bpf_reg)

Given that a lot of archs (RISC-V, arm?, MIPS?) might benefit from
this, it would be nice to make it generic (it pretty much is already)
and move it to kernel/bpf.
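
To illustrate what "generic" could mean here, below is a rough,
standalone sketch of an arch-independent variant that takes an
instruction array instead of a jit_ctx. It is not taken from the patch.
The names sketch_insn, sketch_reg_is_dead and the SK_* macros are
hypothetical; only the numeric opcode values follow the BPF uapi
encoding. It assumes a purely linear scan, so it gives up (reports
"alive") at any jump, call or exit rather than following control flow.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct bpf_insn. */
struct sketch_insn {
	uint8_t code;     /* opcode */
	uint8_t dst_reg;  /* destination register */
	uint8_t src_reg;  /* source register */
};

/* Numeric values match the BPF uapi encodings; the SK_ names are made up. */
#define SK_CLASS(code)	((code) & 0x07)
#define SK_LDX		0x01
#define SK_ST		0x02
#define SK_STX		0x03
#define SK_ALU		0x04
#define SK_ALU64	0x07
#define SK_OP(code)	((code) & 0xf0)
#define SK_MOV		0xb0
#define SK_X		0x08	/* source operand is a register */

/*
 * Returns true only if 'reg' is provably overwritten before it is read,
 * scanning linearly from 'offset'. Every case the scan cannot decide
 * (LD, JMP and JMP32 classes, or running off the end) reports "alive".
 */
bool sketch_reg_is_dead(const struct sketch_insn *insns, int len,
			int offset, int reg)
{
	int i;

	for (i = offset; i < len; i++) {
		const struct sketch_insn *insn = &insns[i];
		bool reads_src, reads_dst, writes_dst;

		switch (SK_CLASS(insn->code)) {
		case SK_ALU:
		case SK_ALU64:
			/* DST = DST op SRC, except MOV, which only writes DST.
			 * Treating bit 0x08 as "reads SRC" is conservative for
			 * opcodes that reuse that bit for something else. */
			reads_src = (insn->code & SK_X) != 0;
			reads_dst = SK_OP(insn->code) != SK_MOV;
			writes_dst = true;
			break;
		case SK_LDX:	/* DST = *(SRC + off) */
			reads_src = true;
			reads_dst = false;
			writes_dst = true;
			break;
		case SK_STX:	/* *(DST + off) = SRC */
			reads_src = true;
			reads_dst = true;
			writes_dst = false;
			break;
		case SK_ST:	/* *(DST + off) = imm */
			reads_src = false;
			reads_dst = true;
			writes_dst = false;
			break;
		default:	/* LD, JMP, JMP32: do not try to reason about these */
			return false;
		}

		/* A read before any write means the register is alive. */
		if (reads_src && insn->src_reg == reg)
			return false;
		if (reads_dst && insn->dst_reg == reg)
			return false;
		if (writes_dst && insn->dst_reg == reg)
			return true;
	}

	return false;
}
```

A real version in kernel/bpf would presumably operate on struct
bpf_prog and would need to decide how much control flow to follow; the
sketch only shows the read-before-write rule in isolation.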

> +{
> +       const struct bpf_prog *prog = ctx->prog;
> +       int i;
> +
> +       for (i = offset; i < prog->len; ++i) {
> +               const struct bpf_insn *insn = &prog->insnsi[i];
> +               const u8 code = insn->code;
> +               const u8 bpf_dst = insn->dst_reg;
> +               const u8 bpf_src = insn->src_reg;
> +               const int writes_dst = !((code & BPF_ST) || (code & BPF_STX)
> +                                        || (code & BPF_JMP32) || (code & BPF_JMP));
> +               const int reads_dst  = !((code & BPF_LD));
> +               const int reads_src  = true;
> +
> +               /* Calls are a bit special in that they clobber a bunch of regisers. */
> +               if ((code & (BPF_JMP | BPF_CALL)) || (code & (BPF_JMP | BPF_TAIL_CALL)))
> +                       if ((bpf_reg >= BPF_REG_0) && (bpf_reg <= BPF_REG_5))
> +                               return false;
> +
> +               /* Registers that are read before they're written are alive.
> +                * Most opcodes are of the form DST = DEST op SRC, but there
> +                * are some exceptions.*/
> +               if (bpf_src == bpf_reg && reads_src)
> +                       return false;
> +
> +               if (bpf_dst == bpf_reg && reads_dst)
> +                       return false;
> +
> +               if (bpf_dst == bpf_reg && writes_dst)
> +                       return true;
> +
> +               /* Most BPF instructions are 8 bits long, but some ar 16 bits
> +                * long. */

A bunch of spelling errors above ("regisers", "ar").


Cheers,
Björn