netdev - Re: 32-bit zext JIT efficiency (Was Re: [PATCH bpf-next] selftests/bpf: two scale tests)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <871s1fsbrx.fsf@netronome.com>
Date:   Fri, 03 May 2019 18:12:02 +0100
From:   Jiong Wang <jiong.wang@...ronome.com>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc:     Daniel Borkmann <daniel@...earbox.net>,
        Networking <netdev@...r.kernel.org>, bpf@...r.kernel.org,
        Jakub Kicinski <jakub.kicinski@...ronome.com>,
        "oss-drivers\@netronome.com" <oss-drivers@...ronome.com>
Subject: Re: 32-bit zext JIT efficiency (Was Re: [PATCH bpf-next] selftests/bpf: two scale tests)

Jiong Wang writes:

>> > if you can craft a test that shows patch_insn issue before your set,
>> > then it's ok to hack bpf_fill_scale1 to use alu64.
>>
>> As described above, does the test_verifier 732 + jit blinding looks convincing?
>>
>> > I would also prefer to go with option 2 (new zext insn) for JITs.
>>
>> Got it.
>
> I followed option 2 and have sent out v5 with latests changes/fixes:

Had done second look at various back-ends, now noticed one new issue,
some arches are not consistent on implicit zext. For example, for s390,
alu32 move could be JITed using single instruction "llgfr" which will do
implicit zext, but alu32 move on PowerPC needs explicit zext. Then for
riscv, all BPF_ALU | BPF_K needs zext but not for some of BPF_ALU | BPF_X.
So, while these arches are generally better off after verifier zext
insertion enabled, but there do have unnecessary zext inserted by verifier
for them case by case.

Also, for 64-bit arches like PowerPC, S390 etc, they normally has zero
extended load, so narrowed load doesn't need extra zext insn, but for
32-bit arches like arm, narrowed load always need explicit zext.

All these differences are because of BPF_ALU32 or BPF_LDX + B | H | W
will be eventually mapped to diversified back-ends which do not have
consistent ISA semantics.

Given all these, looks like pass down the analysis info to back-ends
and let them do the decision become the choice again?

Regards,
Jiong

> The major changes are:
>   - introduced BPF_ZEXT, even though it doesn't resolve insn patch in-efficient,
>     but could let JIT back-ends do optimal code-gen, and the change is small,
>     so perhap just better to support it in this set.
>   - while look insn patch code, I feel patched-insn need to be conservatiely
>     marked if any insn inside patch buffer define sub-register.
>   - Also fixed helper function return value handling bug. I am thinking helper
>     function should have accurate return value type description, otherwise
>     there could be bug. For example arm32 back-end just executes the native
>     helper functions and doesn't do anything special on the return value. So
>     a function returns u32 would only set native reg r0, not r1 in the pair.
>     Then if the outside eBPF insn is casting it into u64, there needs to be
>     zext.
>   - adjusted test_verifier to make sure it could pass on hosts w and w/o hw
>     zext.
>
> For more info, please see the cover letter and patch description at v5.
>
> Thanks.
> Regards,
> Jiong