[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMEtUuxog0w-XkYMHM=Zto_gejy8x5agxohAWOs1zPEtXTr0Sw@mail.gmail.com>
Date: Wed, 13 Aug 2014 10:44:04 -0700
From: Alexei Starovoitov <ast@...mgrid.com>
To: Andy Lutomirski <luto@...capital.net>
Cc: "David S. Miller" <davem@...emloft.net>,
Ingo Molnar <mingo@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Steven Rostedt <rostedt@...dmis.org>,
Daniel Borkmann <dborkman@...hat.com>,
Chema Gonzalez <chema@...gle.com>,
Eric Dumazet <edumazet@...gle.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
"H. Peter Anvin" <hpa@...or.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Kees Cook <keescook@...omium.org>,
Linux API <linux-api@...r.kernel.org>,
Network Development <netdev@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC v4 net-next 01/26] net: filter: add "load 64-bit
immediate" eBPF instruction
On Wed, Aug 13, 2014 at 9:08 AM, Andy Lutomirski <luto@...capital.net> wrote:
> On Wed, Aug 13, 2014 at 12:57 AM, Alexei Starovoitov <ast@...mgrid.com> wrote:
>> add BPF_LD_IMM64 instruction to load 64-bit immediate value into register.
>> All previous instructions were 8-byte. This is first 16-byte instruction.
>> Two consecutive 'struct bpf_insn' blocks are interpreted as single instruction:
>> insn[0/1].code = BPF_LD | BPF_DW | BPF_IMM
>> insn[0/1].dst_reg = destination register
>> insn[0].imm = lower 32-bit
>> insn[1].imm = upper 32-bit
>
> This might be unnecessarily difficult for fancy static analysis tools
> to reason about. Would it make sense to assign two different codes
> for this? For example, insn[0].code = code_for_load_low,
> insns[1].code = code_for_load_high, along with a verifier check that
> they come in matched pairs and that code_for_load_high isn't a jump
> target?
see my reply to David for the same thing. Short answer is that
sequence of instructions (even if it is a pair of instructions like this)
is very hard to detect in verifier and JITs.
As soon as we give compiler two instructions instead of one,
compiler may optimize them in a fancy ways. Like two loads of
64-bit immediate with upper 32-bit the same, may came out as
4 instructions: load_high, load_low, load_low, mov.
Or in some cases as single load_low, etc.
load 64-bit imm has to stay as single instruction to be verifiable
and patch-able easily.
One can argue: force compiler to emit load_low and load_hi
always together, but then that's exactly what I have. It's a single insn.
> (Something else that I find confusing about eBPF: the instruction
> mnemonics are very strange. Have you considered giving them real
> names? For example, load.imm.low instead of BPF_LD | BPF_DW | BPF_IMM
> is easier to read and pronounce.)
BPF_LD | BPF_DW | BPF_IMM is not really a name. It's macro
for cases when instructions are generated from inside the kernel.
Instructions mnemonics are not defined yet.
llvm emits assembler code like:
bpf_prog2:
ldw r1, 16(r1)
std -8(r10), r1
mov r1, 1
std -16(r10), r1
ld_64 r1, 1
mov r2, r10
addi r2, -8
call 4
jeqi r0, 0 goto .LBB1_2
ldd r1, 0(r0)
addi r1, 1
std 0(r0), r1
.LBB1_3:
mov r0, 0
ret
...
I'm open to change assembler/disassembler mnemonics.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists