netdev - Re: [PATCH] bpf,x64: pad NOPs to make images converge more easily

Message-ID: <X9gWStMFulD4DwHR@GaryWorkstation>
Date:   Tue, 15 Dec 2020 09:50:02 +0800
From:   Gary Lin <glin@...e.com>
To:     Daniel Borkmann <daniel@...earbox.net>
Cc:     netdev@...r.kernel.org, bpf@...r.kernel.org,
        Alexei Starovoitov <ast@...nel.org>,
        Eric Dumazet <eric.dumazet@...il.com>,
        andreas.taschner@...e.com
Subject: Re: [PATCH] bpf,x64: pad NOPs to make images converge more easily

On Mon, Dec 14, 2020 at 04:31:44PM +0100, Daniel Borkmann wrote:
> On 12/14/20 9:15 AM, Gary Lin wrote:
> > On Mon, Dec 14, 2020 at 11:56:22AM +0800, Gary Lin wrote:
> > > On Fri, Dec 11, 2020 at 09:05:05PM +0100, Daniel Borkmann wrote:
> > > > On 12/11/20 9:19 AM, Gary Lin wrote:
> > > > > > The x64 bpf jit expects bpf images to converge within the given number
> > > > > > of passes, but it can fail to do so in some corner cases. For example:
> > > > > 
> > > > >         l0:     ldh [4]
> > > > >         l1:     jeq #0x537d, l2, l40
> > > > >         l2:     ld [0]
> > > > >         l3:     jeq #0xfa163e0d, l4, l40
> > > > >         l4:     ldh [12]
> > > > >         l5:     ldx #0xe
> > > > >         l6:     jeq #0x86dd, l41, l7
> > > > >         l8:     ld [x+16]
> > > > >         l9:     ja 41
> > > > > 
> > > > >           [... repeated ja 41 ]
> > > > > 
> > > > >         l40:    ja 41
> > > > >         l41:    ret #0
> > > > >         l42:    ld #len
> > > > >         l43:    ret a
> > > > > 
> > > > > This bpf program contains 32 "ja 41" instructions which are effectively
> > > > > NOPs and designed to be replaced with valid code dynamically. Ideally,
> > > > > bpf jit should optimize those "ja 41" instructions out when translating
> > > > > the bpf instructions into x86_64 machine code. However, do_jit() can
> > > > > only remove one "ja 41" for offset==0 on each pass, so it requires at
> > > > > least 32 runs to eliminate those JMPs and exceeds the current limit of
> > > > > passes (20). In the end, the program got rejected when BPF_JIT_ALWAYS_ON
> > > > > is set even though it's legit as a classic socket filter.
> > > > > 
> > > > > > To make the image more likely to converge within 20 passes, this commit
> > > > > > pads some instructions with NOPs in the last 5 passes:
> > > > > 
> > > > > 1. conditional jumps
> > > > >     A possible size variance comes from the adoption of imm8 JMP. If the
> > > > >     offset is imm8, we calculate the size difference of this BPF instruction
> > > > >     between the previous pass and the current pass and fill the gap with NOPs.
> > > > >     To avoid the recalculation of jump offset, those NOPs are inserted before
> > > > >     the JMP code, so we have to subtract the 2 bytes of imm8 JMP when
> > > > >     calculating the NOP number.
> > > > > 
> > > > > 2. BPF_JA
> > > > >     There are two conditions for BPF_JA.
> > > > >     a.) nop jumps
> > > > >       If this instruction is not optimized out in the previous pass,
> > > > >       instead of removing it, we insert the equivalent size of NOPs.
> > > > >     b.) label jumps
> > > > > >       Similar to conditional jumps, we prepend NOPs right before the JMP
> > > > > >       code.
> > > > > 
> > > > > To make the code concise, emit_nops() is modified to use the signed len and
> > > > > return the number of inserted NOPs.
> > > > > 
> > > > > To support bpf-to-bpf, a new flag, padded, is introduced to 'struct bpf_prog'
> > > > > so that bpf_int_jit_compile() could know if the program is padded or not.
> > > > 
> > > > Please also add multiple hand-crafted test cases e.g. for bpf-to-bpf calls into
> > > > test_verifier (which is part of bpf kselftests) that would exercise this corner
> > > > case in x86 jit where we would start to nop pad so that there is proper coverage,
> > > > too.
> > > > 
> > > The corner case I had in the commit description would likely be rejected by
> > > the verifier because most of those "ja 41" are unreachable instructions.
> > > Is there any known test case that needs more than 15 passes in x86 jit?
> > > 
> > Just an idea. Besides the mentioned corner case, how about making
> > PADDING_PASSES dynamically configurable (sysfs?) and reusing the existing
> > test cases? So that we can have a script to set PADDING_PASSES from 1 to 20
> > and run the bpf selftests separately. This guarantees that the padding
> > strategy will be applied for at least some PADDING_PASSES settings.
> 
> I think exposing such implementation detail to users is not that great as they
> normally should not need to worry about these things (plus it's also rarely hit
> in practice when developing against llvm). On top of all that, such knob would
> have no meaning in case of other JITs since most other non-x86 ones have a fixed
> number of passes. I think it's probably useful for local testing of the fix, but
> less suitable for exposing as sysctl 'uapi' upstream. Re crafting a test case for
> bpf-2-bpf calls, you could orientate on bpf_fill_maxinsns10() in lib/test_bpf.c
> which is also triggering a high number of passes, port it over to test_verifier
> from selftests and experiment from there to integrate calls.
> 
Thanks for the hint. Will try bpf_fill_maxinsns10().
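
For reference, the one-jump-per-pass behaviour from the commit description
can be reproduced with a rough user-space model. This is only a sketch under
simplifying assumptions, not the do_jit() code: the program is nothing but n
forward jumps that all target the instruction after the last one, every
instruction is initially estimated at 64 bytes, a jump costs 0/2/5 bytes for
offset 0/imm8/imm32, and "padding" simply keeps an instruction at its
previous size instead of letting it shrink. The names passes_to_converge(),
jump_size() and MAX_JUMPS are made up for this illustration.

```c
#include <assert.h>
#include <string.h>

#define MAX_JUMPS 64

/* Encoded size of a forward jump: nothing for offset 0 (optimized out),
 * 2 bytes for an imm8 jump, 5 bytes for an imm32 jump. */
static int jump_size(int offset)
{
	if (offset == 0)
		return 0;
	return offset <= 127 ? 2 : 5;
}

/*
 * Model n jumps that all target the end of the program. Each pass sizes
 * every jump from the instruction sizes recorded in the previous pass.
 * During the last padding_passes passes an instruction is never allowed
 * to become smaller than it was in the previous pass (NOP padding).
 *
 * Returns the pass in which the total length equals the previous total,
 * or -1 if that does not happen within max_passes.
 */
static int passes_to_converge(int n, int max_passes, int padding_passes)
{
	int prev[MAX_JUMPS], cur[MAX_JUMPS];
	int prev_total = 0;
	int pass, k, j;

	assert(n >= 1 && n <= MAX_JUMPS);

	/* Initial per-instruction size estimate (assumed: 64 bytes). */
	for (k = 0; k < n; k++) {
		prev[k] = 64;
		prev_total += prev[k];
	}

	for (pass = 1; pass <= max_passes; pass++) {
		int padding = pass > max_passes - padding_passes;
		int total = 0;

		for (k = 0; k < n; k++) {
			int offset = 0;

			/* Distance to the target, using last pass's sizes. */
			for (j = k + 1; j < n; j++)
				offset += prev[j];

			cur[k] = jump_size(offset);
			if (padding && cur[k] < prev[k])
				cur[k] = prev[k];	/* fill the gap with NOPs */
			total += cur[k];
		}

		if (total == prev_total)
			return pass;

		memcpy(prev, cur, n * sizeof(cur[0]));
		prev_total = total;
	}

	return -1;
}
```

In this model, passes_to_converge(32, 20, 0) returns -1 because only one
more jump collapses per pass, while passes_to_converge(32, 20, 5) returns 16,
the first padded pass. The real JIT also has to keep the jump offsets valid
when it pads, which the model ignores.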

Gary Lin