Message-ID: <CAJ+HfNjkacY-KStgGJMgvQh2=2OsMnH6Saij+nAPBqQrSJcNWw@mail.gmail.com>
Date: Tue, 4 Feb 2020 20:13:17 +0100
From: Björn Töpel <bjorn.topel@...il.com>
To: Palmer Dabbelt <palmerdabbelt@...gle.com>
Cc: Daniel Borkmann <daniel@...earbox.net>,
Alexei Starovoitov <ast@...nel.org>, zlim.lnx@...il.com,
catalin.marinas@....com, will@...nel.org,
Martin KaFai Lau <kafai@...com>,
Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
Andrii Nakryiko <andriin@...com>,
Shuah Khan <shuah@...nel.org>, Netdev <netdev@...r.kernel.org>,
bpf <bpf@...r.kernel.org>, linux-arm-kernel@...ts.infradead.org,
LKML <linux-kernel@...r.kernel.org>,
linux-kselftest@...r.kernel.org,
clang-built-linux@...glegroups.com, kernel-team@...roid.com
Subject: Re: [PATCH 4/4] arm64: bpf: Elide some moves to a0 after calls

On Tue, 28 Jan 2020 at 03:15, Palmer Dabbelt <palmerdabbelt@...gle.com> wrote:
>
> On arm64, the BPF function ABI doesn't match the C function ABI. Specifically,
> arm64 encodes calls as `a0 = f(a0, a1, ...)` while BPF encodes calls as
> `BPF_REG_0 = f(BPF_REG_1, BPF_REG_2, ...)`. This discrepancy results in
> function calls being encoded as a two operations sequence that first does a C
> ABI calls and then moves the return register into the right place. This
> results in one extra instruction for every function call.
>
That's a lot of extra work to save one reg-to-reg move, but the extra
move has always annoyed me in the RISC-V JIT. :-) So, if it *can* be
avoided, why not?

[...]
>
> +static int dead_register(const struct jit_ctx *ctx, int offset, int bpf_reg)

Given that many archs (RISC-V, arm?, MIPS?) might benefit from this,
it would be nice to make it generic (it pretty much is already) and
move it to kernel/bpf.

> +{
> + const struct bpf_prog *prog = ctx->prog;
> + int i;
> +
> + for (i = offset; i < prog->len; ++i) {
> + const struct bpf_insn *insn = &prog->insnsi[i];
> + const u8 code = insn->code;
> + const u8 bpf_dst = insn->dst_reg;
> + const u8 bpf_src = insn->src_reg;
> + const int writes_dst = !((code & BPF_ST) || (code & BPF_STX)
> + || (code & BPF_JMP32) || (code & BPF_JMP));
> + const int reads_dst = !((code & BPF_LD));
> + const int reads_src = true;
> +
> + /* Calls are a bit special in that they clobber a bunch of regisers. */
> + if ((code & (BPF_JMP | BPF_CALL)) || (code & (BPF_JMP | BPF_TAIL_CALL)))
> + if ((bpf_reg >= BPF_REG_0) && (bpf_reg <= BPF_REG_5))
> + return false;
> +
> + /* Registers that are read before they're written are alive.
> + * Most opcodes are of the form DST = DEST op SRC, but there
> + * are some exceptions.*/
> + if (bpf_src == bpf_reg && reads_src)
> + return false;
> +
> + if (bpf_dst == bpf_reg && reads_dst)
> + return false;
> +
> + if (bpf_dst == bpf_reg && writes_dst)
> + return true;
> +
> + /* Most BPF instructions are 8 bits long, but some ar 16 bits
> + * long. */

A bunch of spelling errors in the comments above ("regisers", "ar").
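
On the "make it generic" idea, here is a rough userspace sketch of what an
arch-independent check might look like. It is only an illustration, not the
patch's code and not an existing kernel API: `struct insn`, the `CLS_*` /
`OP_*` names and `reg_dead_after()` are hypothetical stand-ins that mirror
the opcode values from the BPF uapi header. It assumes straight-line code
and gives up (reports "not provably dead") at any jump, call or exit, so it
is deliberately conservative. Note that it decodes the instruction class
with a 3-bit mask, since the class is a field rather than a set of flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct bpf_insn (plain bytes, no bitfields). */
struct insn {
	uint8_t code; /* opcode */
	uint8_t dst;  /* destination register number */
	uint8_t src;  /* source register number */
	int16_t off;
	int32_t imm;
};

/* Field decoders; numeric values mirror the BPF uapi opcode encoding. */
#define CLASS(c) ((c) & 0x07)
#define OP(c)    ((c) & 0xf0)
#define SRC(c)   ((c) & 0x08)
#define MODE(c)  ((c) & 0xe0)

enum { CLS_LD, CLS_LDX, CLS_ST, CLS_STX, CLS_ALU, CLS_JMP, CLS_JMP32, CLS_ALU64 };
enum { OP_NEG = 0x80, OP_MOV = 0xb0, OP_END = 0xd0 };
enum { SRC_X = 0x08, MODE_MEM = 0x60, LD_IMM64 = 0x18 /* LD | IMM | DW */ };

/*
 * Return true only if `reg` is overwritten before it is read, scanning
 * forward from index `start`. Anything not understood counts as "live".
 */
static bool reg_dead_after(const struct insn *prog, size_t len,
			   size_t start, uint8_t reg)
{
	assert(prog != NULL && start <= len);

	for (size_t i = start; i < len; i++) {
		const struct insn *in = &prog[i];
		uint8_t code = in->code;

		switch (CLASS(code)) {
		case CLS_ALU:
		case CLS_ALU64: {
			uint8_t op = OP(code);
			/* NEG ignores src; for END the bit selects endianness. */
			bool reads_src = SRC(code) == SRC_X &&
					 op != OP_NEG && op != OP_END;
			bool reads_dst = op != OP_MOV; /* MOV only writes dst */

			if (reads_src && in->src == reg)
				return false;
			if (in->dst == reg)
				return !reads_dst;
			break;
		}
		case CLS_LDX: /* dst = *(src + off) */
			if (in->src == reg)
				return false;
			if (in->dst == reg)
				return true;
			break;
		case CLS_ST: /* *(dst + off) = imm: dst is the address */
			if (in->dst == reg)
				return false;
			break;
		case CLS_STX: /* *(dst + off) = src; atomics: give up */
			if (MODE(code) != MODE_MEM)
				return false;
			if (in->dst == reg || in->src == reg)
				return false;
			break;
		case CLS_LD: /* only the two-slot 64-bit immediate load */
			if (code != LD_IMM64)
				return false;
			if (in->dst == reg)
				return true;
			i++; /* skip the second 8-byte slot */
			break;
		default: /* JMP / JMP32: control flow is not tracked */
			return false;
		}
	}
	return false; /* ran off the end without seeing a write */
}
```

A JIT could call something like this right after emitting a call, passing
the index of the next instruction and BPF_REG_0, and skip the move only when
it returns true. Real code would still need to handle branch targets, which
this sketch sidesteps by bailing out.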

Cheers,
Björn