Message-ID: <20180425110055.GA7313@udknight>
Date: Wed, 25 Apr 2018 19:00:55 +0800
From: Wang YanQing <udknight@...il.com>
To: Daniel Borkmann <daniel@...earbox.net>
Cc: ast@...nel.org, illusionist.neo@...il.com, tglx@...utronix.de,
	mingo@...hat.com, hpa@...or.com, davem@...emloft.net,
	x86@...nel.org, netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] bpf, x86_32: add eBPF JIT compiler for ia32

On Wed, Apr 25, 2018 at 02:11:16AM +0200, Daniel Borkmann wrote:
> On 04/19/2018 05:54 PM, Wang YanQing wrote:
> > The JIT compiler emits ia32 (32-bit x86) instructions. Currently, it
> > supports eBPF only; classic BPF is covered by the conversion done in
> > the BPF core.
> >
> > Almost all instructions from the eBPF ISA are supported, except the
> > following:
> > BPF_ALU64 | BPF_DIV | BPF_K
> > BPF_ALU64 | BPF_DIV | BPF_X
> > BPF_ALU64 | BPF_MOD | BPF_K
> > BPF_ALU64 | BPF_MOD | BPF_X
> > BPF_STX | BPF_XADD | BPF_W
> > BPF_STX | BPF_XADD | BPF_DW
> >
> > It doesn't support BPF_JMP|BPF_CALL with BPF_PSEUDO_CALL either.
> >
> > ia32 has only six general purpose registers (EAX|EDX|ECX|EBX|ESI|EDI),
> > and the JIT can't treat all of them as truly general purpose:
> > MUL instructions need EAX:EDX, and shift instructions need ECX.
> >
> > So I decided to use the stack to emulate all eBPF 64-bit registers.
> > This simplifies the implementation a lot, because we don't have to
> > handle the flexible memory addressing modes on ia32. For example, we
> > don't need the code pattern below for one BPF_ADD instruction:
> >
> > if (src is a register && dst is a register)
> > {
> >	//one instruction encoding for ADD instruction
> > } else if (only src is a register)
> > {
> >	//another different instruction encoding for ADD instruction
> > } else if (only dst is a register)
> > {
> >	//another different instruction encoding for ADD instruction
> > } else
> > {
> >	//src and dst are both on the stack.
> >	//move src or dst to temporary registers
> > }
> >
> > If the above if-else-else-else example doesn't look painful enough,
> > consider the BPF_ALU64|BPF_*SHIFT* instructions, which need many
> > native instructions to emulate.
> >
> > Tested on my PC (Intel(R) Core(TM) i5-5200U CPU) and in VirtualBox.
>
> Just out of plain curiosity, do you have a specific use case on the
> JIT for x86_32? Are you targeting this to be used for e.g. Atom or
> the like (sorry, don't really have a good overview where x86_32 is
> still in use these days)?

Yes, you are right that x86_32 isn't a big deal anymore, but it won't
disappear completely in the near future, much like running a 32-bit
Linux kernel on a 64-bit machine. As for me, I have used 32-bit Linux
on my development/hacking notebook for many years. When I tried to
trace/perf the kernel I ran into eBPF, found it strange that there was
no JIT for x86_32, and decided to fill the gap.
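[Editor's note: to make the stack-based design rationale from the commit
message above concrete, here is a minimal stand-alone C sketch of the
"all registers live on the stack" idea. The slot layout, helper names,
and byte-buffer emitter are hypothetical illustrations, not the patch's
actual code. The point: with both operands in fixed stack slots, a
32-bit ADD always compiles to the same load/op shape, so none of the
operand-location cases above are needed.]

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical slot layout: eBPF register r<n> keeps its low 32
     * bits at [ebp + slot_lo(n)]; the high word would live 4 bytes
     * above. Offsets are illustrative, not the patch's STACK_VAR()
     * layout.
     */
    static int8_t slot_lo(int bpf_reg)
    {
            return (int8_t)(-8 * (bpf_reg + 1));
    }

    /* Low-word half of "dst += src", emitted with one fixed shape:
     *
     *   mov eax, [ebp+src_lo]   ; load source word into a scratch reg
     *   add [ebp+dst_lo], eax   ; apply the op to the destination slot
     *
     * Since both operands always live in memory, there is exactly one
     * encoding path: none of the reg/reg, reg/mem, mem/reg cases from
     * the commit message are needed.
     */
    static size_t emit_add32_lo(uint8_t *buf, int dst, int src)
    {
            size_t n = 0;

            buf[n++] = 0x8B;                  /* mov r32, r/m32          */
            buf[n++] = 0x45;                  /* ModRM: eax, [ebp+disp8] */
            buf[n++] = (uint8_t)slot_lo(src);
            buf[n++] = 0x01;                  /* add r/m32, r32          */
            buf[n++] = 0x45;                  /* ModRM: [ebp+disp8], eax */
            buf[n++] = (uint8_t)slot_lo(dst);
            return n;
    }

    int main(void)
    {
            uint8_t buf[8];
            size_t i, n = emit_add32_lo(buf, /*dst=*/0, /*src=*/1);

            for (i = 0; i < n; i++)
                    printf("%02x ", buf[i]); /* prints: 8b 45 f0 01 45 f8 */
            printf("\n");
            return 0;
    }

The trade-off is extra memory traffic on every operation, which is also
why the commit message expects the jit_harden case to stay slow until
BPF_REG_AX moves into a real register.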
> > Testing results on i5-5200U:
> >
> > 1) test_bpf: Summary: 349 PASSED, 0 FAILED, [319/341 JIT'ed]
> > 2) test_progs: Summary: 81 PASSED, 2 FAILED.
> >    test_progs reports "libbpf: incorrect bpf_call opcode" for
> >    test_l4lb_noinline and test_xdp_noinline. There is no llvm-6.0 on
> >    my machine, and the current implementation doesn't support
> >    BPF_PSEUDO_CALL, so I think we can ignore these two failed test
> >    cases.
> > 3) test_lpm: OK
> > 4) test_lru_map: OK
> > 5) test_verifier: Summary: 823 PASSED, 5 FAILED
> >    test_verifier reports "invalid bpf_context access off=68 size=1/2/4/8"
> >    for all 5 FAILED test cases, both with and without the JIT; we
> >    need to fix the failed test cases themselves rather than this JIT.
>
> Can you elaborate further on these? Looks like this definitely needs
> fixing on 32 bit. Would be great to get a better understanding of the
> underlying bug(s) and properly fix them.

Ok! I will try to dig them out.

> > The above tests were all done with the following flag settings,
> > separately:
> > 1: bpf_jit_enable=1 and bpf_jit_harden=0
> > 2: bpf_jit_enable=1 and bpf_jit_harden=2
> >
> > Below are some numbers for this JIT implementation:
> > Note:
> > I ran each test_progs test case from kselftest 100 times in a row;
> > the numbers are in the format total/runs=avg. The numbers reported
> > by test_bpf show almost the same relationship.
> >
> > a: jit_enable=0 and jit_harden=0          b: jit_enable=1 and jit_harden=0
> > test_pkt_access:PASS:ipv4:15622/100=156   test_pkt_access:PASS:ipv4:10057/100=100
> > test_pkt_access:PASS:ipv6:9130/100=91     test_pkt_access:PASS:ipv6:5055/100=50
> > test_xdp:PASS:ipv4:240198/100=2401        test_xdp:PASS:ipv4:145945/100=1459
> > test_xdp:PASS:ipv6:137326/100=1373        test_xdp:PASS:ipv6:67337/100=673
> > test_l4lb:PASS:ipv4:61100/100=611         test_l4lb:PASS:ipv4:38137/100=381
> > test_l4lb:PASS:ipv6:101000/100=1010       test_l4lb:PASS:ipv6:57779/100=577
> >
> > c: jit_enable=1 and jit_harden=2
> > test_pkt_access:PASS:ipv4:12650/100=126
> > test_pkt_access:PASS:ipv6:7074/100=70
> > test_xdp:PASS:ipv4:147211/100=1472
> > test_xdp:PASS:ipv6:85783/100=857
> > test_l4lb:PASS:ipv4:53222/100=532
> > test_l4lb:PASS:ipv6:76322/100=763
> >
> > Yes, the numbers look good when jit_harden is turned off. If we want
> > to speed up the jit_harden case, we need to move BPF_REG_AX into a
> > *real* register instead of emulating it on the stack, but then we
> > face all the pain described above. We can do that as a next step.
> >
> > See Documentation/networking/filter.txt for more information.
> >
> > Signed-off-by: Wang YanQing <udknight@...il.com>
> [...]
> > +	/* call */
> > +	case BPF_JMP | BPF_CALL:
> > +	{
> > +		const u8 *r1 = bpf2ia32[BPF_REG_1];
> > +		const u8 *r2 = bpf2ia32[BPF_REG_2];
> > +		const u8 *r3 = bpf2ia32[BPF_REG_3];
> > +		const u8 *r4 = bpf2ia32[BPF_REG_4];
> > +		const u8 *r5 = bpf2ia32[BPF_REG_5];
> > +
> > +		if (insn->src_reg == BPF_PSEUDO_CALL)
> > +			goto notyet;
>
> Here you bail out on BPF_PSEUDO_CALL, but in bpf_int_jit_compile()
> below you implement half of e.g. commit 1c2a088a6626 ("bpf: x64: add
> JIT support for multi-function programs"). It is rather confusing to
> leave this in a half-complete state; either remove it altogether or
> implement everything.

Done.
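[Editor's note: the call sequence in the hunk that follows loads r1
into the eax:edx pair and pushes r2-r5 with emit_push_r64(). A
plausible reading of that helper, sketched below with hypothetical
names: push each 64-bit argument's high word first so its low word ends
up at the lower address, as a 32-bit calling convention expects. Four
such 8-byte arguments are also exactly what the later "add esp,32"
cleanup accounts for.]

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical sketch of an emit_push_r64()-style helper: push the
     * two 32-bit halves of a 64-bit eBPF register from its stack
     * slots, high word first, so the low word sits at the lower
     * address.
     */
    static size_t emit_push_r64_sketch(uint8_t *buf, int8_t off_lo,
                                       int8_t off_hi)
    {
            size_t n = 0;

            buf[n++] = 0xFF;             /* push r/m32         */
            buf[n++] = 0x75;             /* ModRM: [ebp+disp8] */
            buf[n++] = (uint8_t)off_hi;  /* high word first    */
            buf[n++] = 0xFF;
            buf[n++] = 0x75;
            buf[n++] = (uint8_t)off_lo;  /* then the low word  */
            return n;
    }

    int main(void)
    {
            uint8_t buf[8];
            size_t i, n = emit_push_r64_sketch(buf, -16, -12);

            for (i = 0; i < n; i++)
                    printf("%02x ", buf[i]); /* ff 75 f4 ff 75 f0 */
            printf("\n");
            return 0;
    }

    /* r2..r5 pushed this way occupy 4 * 8 = 32 bytes, matching the
     * "add esp,32" stack cleanup after the emitted call returns.
     */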
> > +		func = (u8 *) __bpf_call_base + imm32;
> > +		jmp_offset = func - (image + addrs[i]);
> > +
> > +		if (!imm32 || !is_simm32(jmp_offset)) {
> > +			pr_err("unsupported bpf func %d addr %p image %p\n",
> > +			       imm32, func, image);
> > +			return -EINVAL;
> > +		}
> > +
> > +		/* mov eax,dword ptr [ebp+off] */
> > +		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, tmp2[0]),
> > +		      STACK_VAR(r1[0]));
> > +		/* mov edx,dword ptr [ebp+off] */
> > +		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, tmp2[1]),
> > +		      STACK_VAR(r1[1]));
> > +
> > +		emit_push_r64(r5, &prog);
> > +		emit_push_r64(r4, &prog);
> > +		emit_push_r64(r3, &prog);
> > +		emit_push_r64(r2, &prog);
> > +
> > +		EMIT1_off32(0xE8, jmp_offset + 9);
> > +
> > +		/* mov dword ptr [ebp+off],eax */
> > +		EMIT3(0x89, add_2reg(0x40, IA32_EBP, tmp2[0]),
> > +		      STACK_VAR(r0[0]));
> > +		/* mov dword ptr [ebp+off],edx */
> > +		EMIT3(0x89, add_2reg(0x40, IA32_EBP, tmp2[1]),
> > +		      STACK_VAR(r0[1]));
> > +
> > +		/* add esp,32 */
> > +		EMIT3(0x83, add_1reg(0xC0, IA32_ESP), 32);
> > +		break;
> > +	}
> > +	case BPF_JMP | BPF_TAIL_CALL:
> > +		emit_bpf_tail_call(&prog);
> > +		break;
> > +
> > +	/* cond jump */
> > +	case BPF_JMP | BPF_JEQ | BPF_X:
> > +	case BPF_JMP | BPF_JNE | BPF_X:
> > +	case BPF_JMP | BPF_JGT | BPF_X:
> > +	case BPF_JMP | BPF_JLT | BPF_X:
> > +	case BPF_JMP | BPF_JGE | BPF_X:
> > +	case BPF_JMP | BPF_JLE | BPF_X:
> > +	case BPF_JMP | BPF_JSGT | BPF_X:
> > +	case BPF_JMP | BPF_JSLE | BPF_X:
> > +	case BPF_JMP | BPF_JSLT | BPF_X:
> > +	case BPF_JMP | BPF_JSGE | BPF_X:
> > +		/* mov esi,dword ptr [ebp+off] */
> > +		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, tmp[0]),
> > +		      STACK_VAR(src_hi));
> > +		/* cmp dword ptr [ebp+off],esi */
> > +		EMIT3(0x39, add_2reg(0x40, IA32_EBP, tmp[0]),
> > +		      STACK_VAR(dst_hi));
> > +
> > +		EMIT2(IA32_JNE, 6);
> > +		/* mov esi,dword ptr [ebp+off] */
> > +		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, tmp[0]),
> > +		      STACK_VAR(src_lo));
> > +		/* cmp dword ptr [ebp+off],esi */
> > +		EMIT3(0x39, add_2reg(0x40, IA32_EBP, tmp[0]),
> > +		      STACK_VAR(dst_lo));
> > +		goto emit_cond_jmp;
> > +
> > +	case BPF_JMP | BPF_JSET | BPF_X:
> > +		/* mov esi,dword ptr [ebp+off] */
> > +		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, tmp[0]),
> > +		      STACK_VAR(dst_lo));
> > +		/* and esi,dword ptr [ebp+off] */
> > +		EMIT3(0x23, add_2reg(0x40, IA32_EBP, tmp[0]),
> > +		      STACK_VAR(src_lo));
> > +
> > +		/* mov edi,dword ptr [ebp+off] */
> > +		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, tmp[1]),
> > +		      STACK_VAR(dst_hi));
> > +		/* and edi,dword ptr [ebp+off] */
> > +		EMIT3(0x23, add_2reg(0x40, IA32_EBP, tmp[1]),
> > +		      STACK_VAR(src_hi));
> > +		/* or esi,edi */
> > +		EMIT2(0x09, add_2reg(0xC0, tmp[0], tmp[1]));
> > +		goto emit_cond_jmp;
> > +
> > +	case BPF_JMP | BPF_JSET | BPF_K: {
> > +		u32 hi;
> > +
> > +		hi = imm32 & (1<<31) ? (u32)~0 : 0;
> > +		/* mov esi,imm32 */
> > +		EMIT2_off32(0xC7, add_1reg(0xC0, tmp[0]), imm32);
> > +		/* and esi,dword ptr [ebp+off] */
> > +		EMIT3(0x23, add_2reg(0x40, IA32_EBP, tmp[0]),
> > +		      STACK_VAR(dst_lo));
> > +
> > +		/* mov edi,hi */
> > +		EMIT2_off32(0xC7, add_1reg(0xC0, tmp[1]), hi);
> > +		/* and edi,dword ptr [ebp+off] */
> > +		EMIT3(0x23, add_2reg(0x40, IA32_EBP, tmp[1]),
> > +		      STACK_VAR(dst_hi));
> > +		/* or esi,edi */
> > +		EMIT2(0x09, add_2reg(0xC0, tmp[0], tmp[1]));
> > +		goto emit_cond_jmp;
> > +	}
> > +	case BPF_JMP | BPF_JEQ | BPF_K:
> > +	case BPF_JMP | BPF_JNE | BPF_K:
> > +	case BPF_JMP | BPF_JGT | BPF_K:
> > +	case BPF_JMP | BPF_JLT | BPF_K:
> > +	case BPF_JMP | BPF_JGE | BPF_K:
> > +	case BPF_JMP | BPF_JLE | BPF_K:
> > +	case BPF_JMP | BPF_JSGT | BPF_K:
> > +	case BPF_JMP | BPF_JSLE | BPF_K:
> > +	case BPF_JMP | BPF_JSLT | BPF_K:
> > +	case BPF_JMP | BPF_JSGE | BPF_K: {
> > +		u32 hi;
> > +
> > +		hi = imm32 & (1<<31) ? (u32)~0 : 0;
> > +		/* mov esi,hi */
> > +		EMIT2_off32(0xC7, add_1reg(0xC0, tmp[0]), hi);
> > +		/* cmp dword ptr [ebp+off],esi */
> > +		EMIT3(0x39, add_2reg(0x40, IA32_EBP, tmp[0]),
> > +		      STACK_VAR(dst_hi));
> > +
> > +		EMIT2(IA32_JNE, 6);
> > +		/* mov esi,imm32 */
> > +		EMIT2_off32(0xC7, add_1reg(0xC0, tmp[0]), imm32);
> > +		/* cmp dword ptr [ebp+off],esi */
> > +		EMIT3(0x39, add_2reg(0x40, IA32_EBP, tmp[0]),
> > +		      STACK_VAR(dst_lo));
> > +
> > +emit_cond_jmp:		/* convert BPF opcode to x86 */
> > +		switch (BPF_OP(code)) {
> > +		case BPF_JEQ:
> > +			jmp_cond = IA32_JE;
> > +			break;
> > +		case BPF_JSET:
> > +		case BPF_JNE:
> > +			jmp_cond = IA32_JNE;
> > +			break;
> > +		case BPF_JGT:
> > +			/* GT is unsigned '>', JA in x86 */
> > +			jmp_cond = IA32_JA;
> > +			break;
> > +		case BPF_JLT:
> > +			/* LT is unsigned '<', JB in x86 */
> > +			jmp_cond = IA32_JB;
> > +			break;
> > +		case BPF_JGE:
> > +			/* GE is unsigned '>=', JAE in x86 */
> > +			jmp_cond = IA32_JAE;
> > +			break;
> > +		case BPF_JLE:
> > +			/* LE is unsigned '<=', JBE in x86 */
> > +			jmp_cond = IA32_JBE;
> > +			break;
> > +		case BPF_JSGT:
> > +			/* signed '>', GT in x86 */
> > +			jmp_cond = IA32_JG;
> > +			break;
> > +		case BPF_JSLT:
> > +			/* signed '<', LT in x86 */
> > +			jmp_cond = IA32_JL;
> > +			break;
> > +		case BPF_JSGE:
> > +			/* signed '>=', GE in x86 */
> > +			jmp_cond = IA32_JGE;
> > +			break;
> > +		case BPF_JSLE:
> > +			/* signed '<=', LE in x86 */
> > +			jmp_cond = IA32_JLE;
> > +			break;
> > +		default: /* to silence gcc warning */
> > +			return -EFAULT;
> > +		}
> > +		jmp_offset = addrs[i + insn->off] - addrs[i];
> > +		if (is_imm8(jmp_offset)) {
> > +			EMIT2(jmp_cond, jmp_offset);
> > +		} else if (is_simm32(jmp_offset)) {
> > +			EMIT2_off32(0x0F, jmp_cond + 0x10, jmp_offset);
> > +		} else {
> > +			pr_err("cond_jmp gen bug %llx\n", jmp_offset);
> > +			return -EFAULT;
> > +		}
> > +
> > +		break;
> > +	}
> > +	case BPF_JMP | BPF_JA:
> > +		jmp_offset = addrs[i + insn->off] - addrs[i];
> > +		if (!jmp_offset)
> > +			/* optimize out nop jumps */
> > +			break;
> > +emit_jmp:
> > +		if (is_imm8(jmp_offset)) {
> > +			EMIT2(0xEB, jmp_offset);
> > +		} else if (is_simm32(jmp_offset)) {
> > +			EMIT1_off32(0xE9, jmp_offset);
> > +		} else {
> > +			pr_err("jmp gen bug %llx\n", jmp_offset);
> > +			return -EFAULT;
> > +		}
> > +		break;
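[Editor's note: the "jmp_cond + 0x10" in the hunk above leans on a
fixed property of x86 encoding: the one-byte short conditional jumps
(rel8) occupy opcodes 0x70-0x7F, and each condition's near form (rel32)
is 0x0F followed by the same opcode plus 0x10, i.e. 0x80-0x8F. A small
self-contained demonstration, with an illustrative buffer-based emitter
rather than the kernel's EMIT macros:]

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define IA32_JE 0x74 /* short "je rel8" opcode */

    /* Emit a conditional jump, choosing the short (rel8) or near
     * (rel32) form by displacement size, mirroring the is_imm8() /
     * is_simm32() choice in the quoted hunk.
     */
    static size_t emit_jcc(uint8_t *buf, uint8_t jcc_short, int32_t off)
    {
            size_t n = 0;

            if (off >= -128 && off <= 127) {
                    buf[n++] = jcc_short;             /* e.g. 0x74 = je */
                    buf[n++] = (uint8_t)(int8_t)off;
            } else {
                    buf[n++] = 0x0F;                  /* two-byte escape */
                    buf[n++] = jcc_short + 0x10;      /* 0x74 -> 0x84 = je rel32 */
                    memcpy(&buf[n], &off, 4);         /* rel32 (x86 is little-endian) */
                    n += 4;
            }
            return n;
    }

    int main(void)
    {
            uint8_t buf[8];
            size_t i, n;

            n = emit_jcc(buf, IA32_JE, 5);      /* short form: 74 05 */
            for (i = 0; i < n; i++) printf("%02x ", buf[i]);
            printf("\n");

            n = emit_jcc(buf, IA32_JE, 0x1234); /* near form: 0f 84 34 12 00 00 */
            for (i = 0; i < n; i++) printf("%02x ", buf[i]);
            printf("\n");
            return 0;
    }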
> > +	case BPF_LD | BPF_IND | BPF_W:
> > +		func = sk_load_word;
> > +		goto common_load;
> > +	case BPF_LD | BPF_ABS | BPF_W:
> > +		func = CHOOSE_LOAD_FUNC(imm32, sk_load_word);
> > +common_load:
> > +		jmp_offset = func - (image + addrs[i]);
> > +		if (!func || !is_simm32(jmp_offset)) {
> > +			pr_err("unsupported bpf func %d addr %p image %p\n",
> > +			       imm32, func, image);
> > +			return -EINVAL;
> > +		}
> > +		if (BPF_MODE(code) == BPF_ABS) {
> > +			/* mov %edx, imm32 */
> > +			EMIT1_off32(0xBA, imm32);
> > +		} else {
> > +			/* mov edx,dword ptr [ebp+off] */
> > +			EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
> > +			      STACK_VAR(src_lo));
> > +			if (imm32) {
> > +				if (is_imm8(imm32))
> > +					/* add %edx, imm8 */
> > +					EMIT3(0x83, 0xC2, imm32);
> > +				else
> > +					/* add %edx, imm32 */
> > +					EMIT2_off32(0x81, 0xC2, imm32);
> > +			}
> > +		}
> > +		emit_load_skb_data_hlen(&prog);
>
> The only place where you call emit_load_skb_data_hlen() is here, so it
> effectively doesn't cache anything, unlike the other JITs. I would
> suggest keeping the complexity of BPF_ABS/IND very low: remove the
> arch/x86/net/bpf_jit32.S handling altogether and just follow the model
> the way arm64 implemented this in arch/arm64/net/bpf_jit_comp.c. For
> eBPF these instructions are legacy and have been a source of many
> subtle bugs in the various BPF JITs in the past; besides, nobody has
> complained about the current interpreter speed of LD_ABS/IND on x86_32
> so far. ;) We're also very close to having a rewrite of LD_ABS/IND in
> native eBPF with similar performance, so they can finally be removed
> from the core.

Done. Thanks.
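[Editor's note: as a closing aid for the conditional-jump hunks quoted
earlier: for the unsigned conditions, the emitted compare-high-words,
jne-over-the-low-compare, compare-low-words sequence is logically
equivalent to the following C model. This is a sketch of the scheme,
not kernel code.]

    #include <stdint.h>
    #include <stdio.h>

    /* Model of the JIT's two-word scheme for an unsigned 64-bit
     * "dst > src" (BPF_JGT): if the high 32-bit words differ, they
     * alone decide the comparison; only when they are equal does the
     * low word matter. The emitted code gets the same effect by
     * letting "jne" skip the low-word cmp, so the final "ja" tests
     * whichever cmp ran last.
     */
    static int jgt64(uint64_t dst, uint64_t src)
    {
            uint32_t dst_hi = (uint32_t)(dst >> 32);
            uint32_t src_hi = (uint32_t)(src >> 32);
            uint32_t dst_lo = (uint32_t)dst;
            uint32_t src_lo = (uint32_t)src;

            if (dst_hi != src_hi)       /* cmp hi; jne -> hi decides  */
                    return dst_hi > src_hi;
            return dst_lo > src_lo;     /* hi equal: low words decide */
    }

    int main(void)
    {
            printf("%d\n", jgt64(0x100000000ULL, 0xFFFFFFFFULL)); /* 1 */
            printf("%d\n", jgt64(5, 9));                          /* 0 */
            return 0;
    }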