[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <87a7fuezs4.fsf@netronome.com>
Date: Fri, 10 May 2019 22:59:55 +0100
From: Jiong Wang <jiong.wang@...ronome.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Jiong Wang <jiong.wang@...ronome.com>, daniel@...earbox.net,
bpf@...r.kernel.org, netdev@...r.kernel.org,
oss-drivers@...ronome.com
Subject: Re: [PATCH v6 bpf-next 01/17] bpf: verifier: offer more accurate helper function arg and return type
Alexei Starovoitov writes:
> On Fri, May 10, 2019 at 09:30:28AM +0100, Jiong Wang wrote:
>>
>> Alexei Starovoitov writes:
>>
>> > On Thu, May 09, 2019 at 01:32:30PM +0100, Jiong Wang wrote:
>> >>
>> >> Alexei Starovoitov writes:
>> >>
>> >> > On Wed, May 08, 2019 at 03:45:12PM +0100, Jiong Wang wrote:
>> >> >>
>> >> >> I might be misunderstanding your points, please just shout if I am wrong.
>> >> >>
>> >> >> Suppose the following BPF code:
>> >> >>
>> >> >> unsigned helper(unsigned long long, unsigned long long);
>> >> >> unsigned long long test(unsigned *a, unsigned int c)
>> >> >> {
>> >> >> unsigned int b = *a;
>> >> >> c += 10;
>> >> >> return helper(b, c);
>> >> >> }
>> >> >>
>> >> >> We get the following instruction sequence by latest llvm
>> >> >> (-O2 -mattr=+alu32 -mcpu=v3)
>> >> >>
>> >> >> test:
>> >> >> 1: w1 = *(u32 *)(r1 + 0)
>> >> >> 2: w2 += 10
>> >> >> 3: call helper
>> >> >> 4: exit
>> >> >>
>> >> >> Argument Types
>> >> >> ===
>> >> >> Now instruction 1 and 2 are sub-register defines, and instruction 3, the
>> >> >> call, use them implicitly.
>> >> >>
>> >> >> Without the introduction of the new ARG_CONST_SIZE32 and
>> >> >> ARG_CONST_SIZE32_OR_ZERO, we don't know what should be done with w1 and
>> >> >> w2, zero-extend them should be fine for all cases, but could resulting in a
>> >> >> few unnecessary zero-extension inserted.
>> >> >
>> >> > I don't think we're on the same page.
>> >> > The argument type is _const_.
>> >> > In the example above they are not _const_.
>> >>
>> >> Right, have read check_func_arg + check_helper_mem_access again.
>> >>
>> >> Looks like ARG_CONST_SIZE* are designed for describing memory access size
>> >> for things like bounds checking. It must be a constant for stack access,
>> >> otherwise prog will be rejected, but it looks to me variables are allowed
>> >> for pkt/map access.
>> >>
>> >> But pkt/map has extra range info. So, indeed, ARG_CONST_SIZE32* are
>> >> unnecessary, the width could be figured out through the range.
>> >>
>> >> Will just drop this patch in next version.
>> >>
>> >> And sorry for repeating it again, I am still concerned on the issue
>> >> described at https://www.spinics.net/lists/netdev/msg568678.html.
>> >>
>> >> To be simple, zext insertion is based on eBPF ISA and assumes all
>> >> sub-register defines from alu32 or narrow loads need it if the underlying
>> >
>> > It's not an assumption. It's a requirement. If JIT is not zeroing
>> > upper 32-bits after 32-bit alu or narrow load it's a bug.
>> >
>> >> hardware arches don't do it. However, some arches support hardware zext
>> >> partially. For example, PowerPC, SPARC etc are 64-bit arches, while they
>> >> don't do hardware zext on alu32, they do it for narrow loads. And RISCV is
>> >> even more special, some alu32 has hardware zext, some don't.
>> >>
>> >> At the moment we have single backend hook "bpf_jit_hardware_zext", once a
>> >> backend enable it, verifier just insert zero extension for all identified
>> >> alu32 and narrow loads.
>> >>
>> >> Given verifier analysis info is not pushed down to JIT back-ends, verifier
>> >> needs more back-end info pushed up from back-ends. Do you think make sense
>> >> to introduce another hook "bpf_jit_hardware_zext_narrow_load" to at least
>> >> prevent unnecessary zext inserted for narrowed loads for arches like
>> >> PowerPC, SPARC?
>> >>
>> >> The hooks to control verifier zext insertion then becomes two:
>> >>
>> >> bpf_jit_hardware_zext_alu32
>> >> bpf_jit_hardware_zext_narrow_load
>> >
>> > and what to do with other combinations?
>> > Like in some cases narrow load on particular arch will be zero extended
>> > by hw and if it's misaligned or some other condition then it will not be?
>> > It doesn't feel that we can enumerate all such combinations.
>>
>> Yes, and above narrow_load is just an example. As mentioned, behaviour on
>> alu32 also diverse on some arches.
>>
>> > It feels 'bpf_jit_hardware_zext' backend hook isn't quite working.
>>
>> It is still useful for x86_64 and aarch64 to disable verifier insertion
>> pass completely. But then perhaps should be renamed into
>> "bpf_jit_verifier_zext". Returning false means verifier should disable the
>> insertion completely.
>
> I think the name is too cryptic.
> May be call it bpf_jit_needs_zext ?
Ack.
> x64/arm64 will set it false and the rest to true ?
Ack.
>> > It optimizes out some zext, but may be adding unnecessary extra zexts.
>>
>> This is exactly my concern.
>>
>> > May be it should be a global flag from the verifier unidirectional to JITs
>> > that will say "the verifier inserted MOV32 where necessary. JIT doesn't
>> > need to do zext manually".
>> > And then JITs will remove MOV32 when hw does it automatically.
>> > Removal should be easy, since such insn will be right after corresponding
>> > alu32 or narrow load.
>>
>> OK, so you mean do a simple peephole to combine insns. JIT looks forward
>> the next insn, if it is mov32 with dst_src == src_reg, then skip it. And
>> only do this when jitting a sub-register write eBPF insn and there is
>> hardware zext support.
>>
>> I guess such special mov32 won't be generated by compiler that it won't be
>> jump destination hence skip it is safe.
>>
>> For zero extension insertion part of this set, I am going to do the
>> following changes in next version:
>>
>> 1. verifier inserts special "mov32" (dst_reg == src_reg) as "zext".
>> JIT could still save zext for the other "mov32", but should always do
>> zext for this special "mov32".
>
> May be used mov32 with imm==1 as indicator that such mov32 is special?
OK, will go with it.
>
>> 2. rename 'bpf_jit_hardware_zext' to 'bpf_jit_verifier_zext' which
>> returns false at default to disable zext insertion.
>> 3. JITs want verifier zext override bpf_jit_verifier_zext to return
>> true and should skip unnecessary mov32 as described above.
>>
>> Looks good?
>
> Kinda makes sense, but when x64/arm64 are saying 'dont do zext'
> what verifier suppose to do? It will still do the analysis
> and liveness marks, but only mov32 won't be inserted?
Yes. The analysis part is still enabled, it is quite integrated with
existing liveness tracking infrastructure, and is not heavy.
zext insertion part is disabled.
> I guess that's fine, since BPF_F_TEST_RND_HI32 will use
> the results of the analysis?
Yes, hi32 poisoning needs it.
> Please double check that BPF_F_TEST_RND_HI32 poisoning works on
> 32-bit archs too.
OK. I don't have 32-bit host env at the moment, but will try to sort this
out.
Regards,
Jiong
Powered by blists - more mailing lists