Message-ID: <31605274-2146-1bb8-7625-8820f6948f6e@iogearbox.net>
Date: Mon, 6 May 2019 16:49:38 +0200
From: Daniel Borkmann <daniel@...earbox.net>
To: Jiong Wang <jiong.wang@...ronome.com>, alexei.starovoitov@...il.com
Cc: bpf@...r.kernel.org, netdev@...r.kernel.org,
oss-drivers@...ronome.com
Subject: Re: [PATCH v6 bpf-next 02/17] bpf: verifier: mark verified-insn with
sub-register zext flag
On 05/06/2019 03:49 PM, Daniel Borkmann wrote:
> On 05/03/2019 12:42 PM, Jiong Wang wrote:
>> The eBPF ISA specification requires the high 32 bits to be cleared when
>> the low 32-bit sub-register is written. This applies to the destination
>> register of ALU32 etc. JIT back-ends must guarantee this semantic when
>> doing code-gen.
>>
>> The x86-64 and arm64 ISAs have the same semantic, so the corresponding
>> JIT back-ends don't need to do extra work. However, 32-bit arches (arm,
>> nfp, etc.) and some other 64-bit arches (powerpc, sparc, etc.) need an
>> explicit zero-extension sequence to meet this semantic.
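
For reference, the explicit zero-extension sequence mentioned above can be
expressed as two BPF ALU64 shifts. A minimal sketch, using the insn macros
from include/linux/filter.h (r0 chosen arbitrarily):

	#include <linux/filter.h>

	static const struct bpf_insn zext_sketch[] = {
		BPF_ALU64_IMM(BPF_LSH, BPF_REG_0, 32), /* r0 <<= 32 */
		BPF_ALU64_IMM(BPF_RSH, BPF_REG_0, 32), /* r0 >>= 32, logical
							* shift zeroes the
							* high half */
	};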
>>
>> This is important because, for code like the following:
>>
>> u64_value = (u64) u32_value
>> ... other uses of u64_value
>>
>> the compiler can exploit the semantic described above and omit the zero
>> extension when converting u32_value to u64_value; hardware, the runtime,
>> or the BPF JIT back-ends are then responsible for guaranteeing it. Some
>> benchmarks show that ~40% of total insns are sub-register writes, meaning
>> ~40% extra code-gen (and even more for arches which require two shifts
>> for zero extension) if the JIT back-end has to zero-extend every such
>> instruction.
>>
>> However, this is not always necessary: u32_value may never be cast into
>> a u64, which is quite common in real-life programs. So it would be really
>> good if we could identify the places where such a cast happens, and do
>> zero extensions only for them, not for the others. This could save a lot
>> of BPF code-gen.
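
To make the pattern concrete: a sub-register write followed by a 64-bit read
might look as below (a constructed sketch, not from the patch, using the insn
macros from include/linux/filter.h):

	#include <linux/filter.h>

	static const struct bpf_insn subreg_then_read64[] = {
		/* w0 = w1: ALU32 write; the ISA guarantees the upper
		 * 32 bits of r0 are zeroed by this insn.
		 */
		BPF_MOV32_REG(BPF_REG_0, BPF_REG_1),
		/* r2 += r0: full 64-bit read relying on that guarantee */
		BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_0),
		BPF_EXIT_INSN(),
	};

If r0 were instead only ever read through w0 afterwards, the zero extension
emitted for the mov32 would be dead code, which is what this patch aims to
detect.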
>>
>> Algo:
>> - Split read flags into READ32 and READ64.
>>
>> - Record the indices of instructions that do sub-register defs (writes).
>> These indices need to stay with the reg state so that path pruning and
>> bpf-to-bpf function calls can be handled properly.
>>
>> These indices are kept up to date while doing the insn walk.
>>
>> - A full register read on an active sub-register def marks the def insn
>> as needing zero extension on the dst register.
>>
>> - A new sub-register write overrides the old one.
>>
>> A new full register write makes the register free of zero extension on
>> the dst register.
>>
>> - When propagating read64 during path pruning, also mark the def insns of
>> registers whose active defs are sub-register defs; a minimal sketch of
>> this bookkeeping follows below.
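
A minimal sketch of that bookkeeping (field and function names are
hypothetical, not the patch's actual ones):

	#include <linux/types.h>

	/* Each register remembers the insn index of its last sub-register
	 * (32-bit) write; 0 means no sub-register def is active.
	 */
	struct reg_zext_sketch {
		u32 subreg_def;
	};

	/* Full 64-bit read: if a sub-register def is still active, the
	 * insn that produced it must zero-extend its dst register.
	 */
	static void mark_read64(bool *insn_needs_zext,
				struct reg_zext_sketch *reg)
	{
		if (reg->subreg_def)
			insn_needs_zext[reg->subreg_def] = true;
	}

A new sub-register write would overwrite subreg_def with its own insn index,
and a full 64-bit write would reset it to 0, matching the rules above.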
>>
>> Reviewed-by: Jakub Kicinski <jakub.kicinski@...ronome.com>
>> Signed-off-by: Jiong Wang <jiong.wang@...ronome.com>
>
> [...]
>> +/* This function is supposed to be used by the following 32-bit optimization
>> + * code only. It returns TRUE if the source or destination register operates
>> + * on 64 bits, otherwise it returns FALSE.
>> + */
>> +static bool is_reg64(struct bpf_verifier_env *env, struct bpf_insn *insn,
>> + u32 regno, struct bpf_reg_state *reg, enum reg_arg_type t)
>> +{
>> + u8 code, class, op;
>> +
>> + code = insn->code;
>> + class = BPF_CLASS(code);
>> + op = BPF_OP(code);
>> + if (class == BPF_JMP) {
>> + /* BPF_EXIT for "main" will reach here. Return TRUE
>> + * conservatively.
>> + */
>> + if (op == BPF_EXIT)
>> + return true;
>> + if (op == BPF_CALL) {
>> + /* BPF-to-BPF calls will reach here because caller-saved
>> + * clobbers are marked with DST_OP_NO_MARK, for which we
>> + * don't care about the register def since they are anyway
>> + * marked as NOT_INIT already.
>> + */
>> + if (insn->src_reg == BPF_PSEUDO_CALL)
>> + return false;
>> + /* Helper call will reach here because of arg type
>> + * check.
>> + */
>> + if (t == SRC_OP)
>> + return helper_call_arg64(env, insn->imm, regno);
>> +
>> + return false;
>> + }
>> + }
>> +
>> + if (class == BPF_ALU64 || class == BPF_JMP ||
>> + /* BPF_END always uses the BPF_ALU class. */
>> + (class == BPF_ALU && op == BPF_END && insn->imm == 64))
>> + return true;
>
> For the BPF_JMP + JA case we don't look at registers, but I presume here
> we 'pretend' to use 64-bit regs to be more conservative, as the verifier
> would otherwise need to do more complex analysis at the jump target wrt
> zero extension, correct?
Hmm, scratch that last thought. Shouldn't it behave the same as with the
below class == BPF_JMP32 case?
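
For context, the JMP vs JMP32 distinction is purely about operand width; a
constructed example (not from the patch) with the insn macros from
include/linux/filter.h:

	static const struct bpf_insn jmp_widths[] = {
		/* if r1 > r2 goto +1: reads the full 64-bit registers */
		BPF_JMP_REG(BPF_JGT, BPF_REG_1, BPF_REG_2, 1),
		/* if w1 > w2 goto +1: reads only the low 32-bit halves */
		BPF_JMP32_REG(BPF_JGT, BPF_REG_1, BPF_REG_2, 1),
	};
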
>> + if (class == BPF_ALU || class == BPF_JMP32)
>> + return false;
>> +
>> + if (class == BPF_LDX) {
>> + if (t != SRC_OP)
>> + return BPF_SIZE(code) == BPF_DW;
>> + /* LDX source must be ptr. */
>> + return true;
>> + }
>> +
>> + if (class == BPF_STX) {
>> + if (reg->type != SCALAR_VALUE)
>> + return true;
>> + return BPF_SIZE(code) == BPF_DW;
>> + }
>> +
>> + if (class == BPF_LD) {
>> + u8 mode = BPF_MODE(code);
>> +
>> + /* LD_IMM64 */
>> + if (mode == BPF_IMM)
>> + return true;
>> +
>> + /* Both LD_IND and LD_ABS return 32-bit data. */
>> + if (t != SRC_OP)
>> + return false;
>> +
>> + /* Implicit ctx ptr. */
>> + if (regno == BPF_REG_6)
>> + return true;
>> +
>> + /* Explicit source could be any width. */
>> + return true;
>> + }
>> +
>> + if (class == BPF_ST)
>> + /* The only source register for BPF_ST is a ptr. */
>> + return true;
>> +
>> + /* Conservatively return true by default. */
>> + return true;
>> +}
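
For context, a few constructed examples (not from the patch) of how the above
classifies load/store operands, using the insn macros from
include/linux/filter.h:

	#include <linux/filter.h>

	static const struct bpf_insn ldst_examples[] = {
		/* r0 = *(u32 *)(r1 + 0): src r1 is a ptr, so a 64-bit
		 * read; the 4-byte dst write is a sub-register def.
		 */
		BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1, 0),
		/* r0 = *(u64 *)(r1 + 0): 8-byte dst write, 64-bit def */
		BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, 0),
		/* r0 = 0x100000000 (ld_imm64): always a 64-bit def */
		BPF_LD_IMM64(BPF_REG_0, 0x100000000ULL),
	};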