Message-Id: <760D400C-2548-41B6-AE34-F89A66397A75@netronome.com>
Date: Sun, 28 Apr 2019 23:11:57 +0100
From: Jiong Wang <jiong.wang@...ronome.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Alexei Starovoitov <ast@...nel.org>, daniel@...earbox.net,
netdev@...r.kernel.org, bpf@...r.kernel.org,
Jakub Kicinski <jakub.kicinski@...ronome.com>,
"oss-drivers@...ronome.com" <oss-drivers@...ronome.com>
Subject: Re: 32-bit zext time complexity (Was Re: [PATCH bpf-next]
selftests/bpf: two scale tests)
> On 27 Apr 2019, at 04:05, Alexei Starovoitov <alexei.starovoitov@...il.com> wrote:
>
> On Fri, Apr 26, 2019 at 02:06:33PM +0100, Jiong Wang wrote:
<snip>
>>
>>> Note that bpf_patch_insn_single() is calling bpf_adj_branches() twice too.
>>> And dead_code + convert_ctx + fixup_bpf_calls are calling
>>> bpf_patch_insn_single() a lot.
>>> How about, before the dead_code pass, we convert the program into basic-block
>>> format, patch it all, and then convert from bb back to offsets.
>>> Patching will become very cheap, since no loop over the program will be
>>> necessary. A jump from bb-N to bb-M will stay as-is regardless
>>> of the amount of patching done inside each bb.
>>> The loops inside these patching passes will be converted from:
>>> for (i = 0; i < insn_cnt; i++, insn++)
>>> into:
>>> for each bb
>>> for each insn in bb
>>
>> Interesting. If I understand correctly, each BB then needs to support
>> dynamic insn buffer resize. And after all insn patching has finished and all
>> BBs are finalized, we then linearize the BBs (in the best order) to generate
>> the final bpf image.
>
> dynamic BB resize could be done similarly to the existing prog resize.
> It grows in page increments.
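To make sure I read this correctly, here is a rough sketch of the bb container I
have in mind (all names below are made up, nothing like this exists in the
verifier today):

        struct patchable_bb {
                struct bpf_insn *insns;  /* private insn buffer of this bb */
                u32 cnt;                 /* insns currently used */
                u32 cap;                 /* capacity, grown in page increments,
                                          * much like bpf_prog_realloc today */
                struct patchable_bb *taken; /* branch target kept as a bb
                                             * pointer, so patching inside one
                                             * bb never invalidates branches
                                             * in the other bbs */
        };

Patching then only updates the owning bb's cnt/cap, and a single pass at the
very end linearizes all bbs back into one insn array and turns the bb pointers
into real branch offsets.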
>
>>> As far as option 1 "also pass aux_insn information to JITs"...
>>> in theory it's fine, but looks like big refactoring to existing code.
>>
>> Will do a quick exploration; it might turn out to be a small change.
>>
>>> So if you want to make this bb conversion step 2 and unblock the
>>> current patch set faster, I suggest going with option 2 "Introduce zero
>>> extension insn".
>>
>> On second thought, even with the zero extension insn introduced, it is
>> inserted after the sub-register write insn, so we are still doing an insert,
>> *not* a replacement. insn_delta inside bpf_patch_insn_single will be 1, so
>> the slow path will always be taken (memmove + bpf_prog_realloc + 2 x
>> bpf_adj_branches).
>
> ahh. right.
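To make the cost concrete, this is roughly what each zext patch would look like
with option 2 (BPF_ZEXT_INSN() below is a made-up macro, the real encoding of
the new insn is still to be decided):

        struct bpf_insn patch[] = {
                *insn,                          /* original sub-register write */
                BPF_ZEXT_INSN(insn->dst_reg),   /* hypothetical new zext insn  */
        };

        /* len == 2, so insn_delta == 1 inside bpf_patch_insn_single(): the
         * len == 1 "replace in place" fast path is never taken, and every
         * patched location pays for memmove + bpf_prog_realloc +
         * 2 x bpf_adj_branches.
         */
        new_prog = bpf_patch_insn_single(prog, i, patch, ARRAY_SIZE(patch));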
>
>> For the 1M-scale test, bpf_patch_insn_single is triggered because of hi32
>> poisoning, not lo32 zext. So we just need to change MOV32 to MOV64 in the
>> testcase, which to my understanding doesn't break the original purpose of
>> this testcase. This would then avoid 1M calls to bpf_patch_insn_single and
>> let the test pass after the 32-bit opt patch set is applied.
>>
>> Without this change and with hi32 randomization enabled, scale tests will
>> still hang until the insn patch infra is improved.
>>
>> @@ -228,7 +228,7 @@ static void bpf_fill_scale1(struct bpf_test *self)
>> - insn[i++] = BPF_ALU32_IMM(BPF_MOV, BPF_REG_0, 42);
>> + insn[i++] = BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 42);
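Side note on why that one-line switch is enough, as I understand the hi32
poisoning pass:

        /* BPF_ALU32_IMM(BPF_MOV, BPF_REG_0, 42) is a sub-register write, so
         * the hi32 poisoning pass patches in an extra insn after it -- nearly
         * 1M calls to bpf_patch_insn_single() for this test.
         * BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 42) defines all 64 bits of R0, so
         * the poisoning pass has nothing to insert and no patching happens.
         */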
>>
>> This change is not to paper over the underlying issue. We now know the
>> existing insn patch infra doesn't scale to the million level, so we could
>> work on improving it in the next step.
>
> I'm hesitant to step back.
> Do you see a program that can hit this patch_insn issue already?
> (I mean without your hi32/lo32 zext changes).
Not really on a real world program, which I feel won't do that much insn patching.
But on test_verifier it can be reproduced by enabling jit blinding,
because the "scale" test contains imm insns.
For example, on bpf-next master, just run:
sysctl net/core/bpf_jit_enable=1
sysctl net/core/bpf_jit_harden=2
test_verifier 732 (732 is the test number for “scale: scale test1” on my env)
This enables constant blinding, which also needs insn patching.
Test 732 contains nearly 1M BPF_MOV_IMM insns to be blinded, so it could
have a similar effect to hi32 poisoning.
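From memory, blinding rewrites each eligible imm insn into roughly a three
insn sequence through the hidden BPF_REG_AX, something like:

        /* BPF_ALU32_IMM(BPF_MOV, dst, imm), with a random value rnd, becomes: */
        BPF_ALU32_IMM(BPF_MOV, BPF_REG_AX, rnd ^ imm),
        BPF_ALU32_IMM(BPF_XOR, BPF_REG_AX, rnd),
        BPF_ALU32_REG(BPF_MOV, dst, BPF_REG_AX),

so every blinded insn goes through bpf_patch_insn_single() with
insn_delta == 2, hitting the same slow path as the hi32 poisoning insns.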
And benchmarking shows that once the insn patch helper is called over 15000
times, the user can feel the load delay, and when it is called around 50000
times, it takes about half a minute to finish verification.
15000 : 3s
45000 : 29s
95000 : 125s
195000: 712s
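For what it's worth, these numbers look roughly quadratic, which is what you
would expect when every patch call re-walks the program in bpf_adj_branches():
45000/15000 = 3 and 29s/3s is about 9.7, close to 3^2; 95000/45000 is about
2.1 and 125s/29s is about 4.3, close to 2.1^2.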
>
>> At the same time, the issue exposed by hi32 poisoning does raise the alarm
>> that there could be the same issue for lo32 zext, so this patch set doesn't
>> scale if there are lots of insns to be zero extended, even though this may
>> not happen in real world bpf progs.
>>
>> IMHO, avoiding insn patching when possible might always be better. So, if
>> option 1 turns out to also generate a clean patch set and introduce only
>> small changes, I am going to follow it in the updated version.
>>
>> Please let me know if you have different opinion.
>
> if you can craft a test that shows the patch_insn issue before your set,
> then it's ok to hack bpf_fill_scale1 to use alu64.
As described above, does test_verifier 732 + jit blinding look convincing?
> I would also prefer to go with option 2 (new zext insn) for JITs.
Got it.
> I still don't have a good feeling about option 1.
> Exposing all of aux_data to JITs may become a headache
> in the verifier development. It needs to be done carefully.
OK, understood.
Just for clarification, I was thinking of adding just one field, something like
"bool *zext_dst", inside bpf_prog_aux, then copying only
"env->insn_aux_data[*].zext_dst" into bpf_prog->aux->zext_dst, not all the
aux_data generated by the verifier. The field in bpf_prog could later be
extended to a structure containing the analysis info the verifier wants to push
down to the JIT back-ends, and the JIT back-ends must free it as soon as JIT
compilation is done.
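A rough sketch of what I mean (the field name is just illustrative,
emit_zext() is made up, and only zext_dst gets copied out of the verifier):

        /* one new field in bpf_prog_aux */
        bool *zext_dst;         /* one flag per insn: does this insn need its
                                 * destination zero extended?  Allocated by the
                                 * verifier, freed by the JIT back-end. */

        /* in the verifier, once analysis has finished */
        prog->aux->zext_dst = kcalloc(prog->len, sizeof(bool), GFP_KERNEL);
        if (!prog->aux->zext_dst)
                return -ENOMEM;
        for (i = 0; i < prog->len; i++)
                prog->aux->zext_dst[i] = env->insn_aux_data[i].zext_dst;

        /* in a JIT back-end, when emitting insn i */
        if (prog->aux->zext_dst[i])
                emit_zext(ctx, dst_reg);

        /* and the JIT frees it as soon as compilation is done */
        kfree(prog->aux->zext_dst);
        prog->aux->zext_dst = NULL;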