Message-ID: <CAJ+HfNhvTdsBq_tmKNcxVdS=nro=jwL5yLxnyDXO02Vai+5YNg@mail.gmail.com>
Date:   Tue, 7 Jan 2020 11:14:17 +0100
From:   Björn Töpel <bjorn.topel@...il.com>
To:     Palmer Dabbelt <palmerdabbelt@...gle.com>
Cc:     Daniel Borkmann <daniel@...earbox.net>,
        Alexei Starovoitov <ast@...nel.org>,
        Netdev <netdev@...r.kernel.org>, linux-riscv@...ts.infradead.org,
        bpf <bpf@...r.kernel.org>
Subject: Re: [PATCH bpf-next v2 7/9] riscv, bpf: optimize calls

On Mon, 23 Dec 2019 at 19:58, Palmer Dabbelt <palmerdabbelt@...gle.com> wrote:
>
> On Mon, 16 Dec 2019 01:13:41 PST (-0800), Bjorn Topel wrote:
> > Instead of using emit_imm() and emit_jalr() which can expand to six
> > instructions, start using jal or auipc+jalr.
> >
> > Signed-off-by: Björn Töpel <bjorn.topel@...il.com>
> > ---
> >  arch/riscv/net/bpf_jit_comp.c | 101 +++++++++++++++++++++-------------
> >  1 file changed, 64 insertions(+), 37 deletions(-)
> >
> > diff --git a/arch/riscv/net/bpf_jit_comp.c b/arch/riscv/net/bpf_jit_comp.c
> > index 46cff093f526..8d7e3343a08c 100644
> > --- a/arch/riscv/net/bpf_jit_comp.c
> > +++ b/arch/riscv/net/bpf_jit_comp.c
> > @@ -811,11 +811,12 @@ static void emit_sext_32_rd(u8 *rd, struct rv_jit_context *ctx)
> >       *rd = RV_REG_T2;
> >  }
> >
> > -static void emit_jump_and_link(u8 rd, int rvoff, struct rv_jit_context *ctx)
> > +static void emit_jump_and_link(u8 rd, s64 rvoff, bool force_jalr,
> > +                            struct rv_jit_context *ctx)
> >  {
> >       s64 upper, lower;
> >
> > -     if (is_21b_int(rvoff)) {
> > +     if (rvoff && is_21b_int(rvoff) && !force_jalr) {
> >               emit(rv_jal(rd, rvoff >> 1), ctx);
> >               return;
> >       }
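
For reference, since the changelog only mentions the instruction counts:
the choice above comes down to how far the call target is from the call
site. A rough sketch of the two range checks involved (fits_jal() and
fits_auipc_jalr() are illustrative names only; the patch itself uses the
in-tree is_21b_int()/is_32b_int() helpers):

	/* jal encodes a 21-bit signed, 2-byte-aligned offset: +-1 MiB. */
	static bool fits_jal(s64 off)
	{
		return off >= -(1LL << 20) && off < (1LL << 20);
	}

	/* auipc (upper 20 bits) plus jalr (lower 12 bits) together cover
	 * a 32-bit signed offset from pc: +-2 GiB. Anything further needs
	 * emit_imm() to build the full 64-bit address, which can expand
	 * to six instructions before the jalr.
	 */
	static bool fits_auipc_jalr(s64 off)
	{
		return off >= -(1LL << 31) && off < (1LL << 31);
	}
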
> > @@ -832,6 +833,28 @@ static bool is_signed_bpf_cond(u8 cond)
> >               cond == BPF_JSGE || cond == BPF_JSLE;
> >  }
> >
> > +static int emit_call(bool fixed, u64 addr, struct rv_jit_context *ctx)
> > +{
> > +     s64 off = 0;
> > +     u64 ip;
> > +     u8 rd;
> > +
> > +     if (addr && ctx->insns) {
> > +             ip = (u64)(long)(ctx->insns + ctx->ninsns);
> > +             off = addr - ip;
> > +             if (!is_32b_int(off)) {
> > +                     pr_err("bpf-jit: target call addr %pK is out of range\n",
> > +                            (void *)addr);
> > +                     return -ERANGE;
> > +             }
> > +     }
> > +
> > +     emit_jump_and_link(RV_REG_RA, off, !fixed, ctx);
> > +     rd = bpf_to_rv_reg(BPF_REG_0, ctx);
> > +     emit(rv_addi(rd, RV_REG_A0, 0), ctx);
>
> Why are they out of order?  It seems like it'd be better to just have the BPF
> calling convention match the RISC-V calling convention, as that'd avoid
> juggling the registers around.
>

BPF passes arguments in R1, R2, ..., and the return value in R0. Given
that a0 plays the role of both R1 and R0, how can we avoid the register
juggling (without complicating the JIT too much)? It would be nice,
though... and ARM64 has the same concern AFAIK.
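
For concreteness, a rough sketch of why the move is there (from memory,
not copied verbatim from bpf_jit_comp.c; regmap_sketch is only an
illustrative name):

	/* BPF R0 and R1 map to different RISC-V registers in the rv64 JIT: */
	static const int regmap_sketch[] = {
		[BPF_REG_0] = RV_REG_A5,	/* BPF return value */
		[BPF_REG_1] = RV_REG_A0,	/* BPF arg1; also the first C
						 * argument and the C return
						 * register */
		[BPF_REG_2] = RV_REG_A1,
		/* ... */
	};

	/*
	 * So right after emit_jump_and_link() in emit_call() above, the JIT
	 * copies the C return value out of a0 into R0's register:
	 *
	 *	mv a5, a0	(emitted as addi a5, a0, 0)
	 *
	 * Mapping R0 to a0 instead would drop that move, but then R0 and R1
	 * share a register and the juggling just moves to the call setup.
	 */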

[...]
> > @@ -1599,36 +1611,51 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
> >       for (i = 0; i < 16; i++) {
> >               pass++;
> >               ctx->ninsns = 0;
> > -             if (build_body(ctx, extra_pass)) {
> > +             if (build_body(ctx, extra_pass, ctx->offset)) {
> >                       prog = orig_prog;
> >                       goto out_offset;
> >               }
> >               build_prologue(ctx);
> >               ctx->epilogue_offset = ctx->ninsns;
> >               build_epilogue(ctx);
> > -             if (ctx->ninsns == prev_ninsns)
> > -                     break;
> > +
> > +             if (ctx->ninsns == prev_ninsns) {
> > +                     if (jit_data->header)
> > +                             break;
> > +
> > +                     image_size = sizeof(u32) * ctx->ninsns;
> > +                     jit_data->header =
> > +                             bpf_jit_binary_alloc(image_size,
> > +                                                  &jit_data->image,
> > +                                                  sizeof(u32),
> > +                                                  bpf_fill_ill_insns);
> > +                     if (!jit_data->header) {
> > +                             prog = orig_prog;
> > +                             goto out_offset;
> > +                     }
> > +
> > +                     ctx->insns = (u32 *)jit_data->image;
> > +                     /* Now, when the image is allocated, the image
> > +                      * can potentially shrink more (auipc/jalr ->
> > +                      * jal).
> > +                      */
> > +             }
>
> It seems like these fragments should go along with patch #2 that introduces the
> code, as I don't see anything above that makes this necessary here.
>

You're right, these fragments belong with patch #2.


Björn
