[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALCETrVDrbD3goYmZsUdmEhVfaNxovyghCz6y+_q5+G+rVwtWg@mail.gmail.com>
Date: Wed, 13 Aug 2014 14:17:24 -0700
From: Andy Lutomirski <luto@...capital.net>
To: Alexei Starovoitov <ast@...mgrid.com>
Cc: "David S. Miller" <davem@...emloft.net>,
Ingo Molnar <mingo@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Steven Rostedt <rostedt@...dmis.org>,
Daniel Borkmann <dborkman@...hat.com>,
Chema Gonzalez <chema@...gle.com>,
Eric Dumazet <edumazet@...gle.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
"H. Peter Anvin" <hpa@...or.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Kees Cook <keescook@...omium.org>,
Linux API <linux-api@...r.kernel.org>,
Network Development <netdev@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC v4 net-next 01/26] net: filter: add "load 64-bit
immediate" eBPF instruction
On Wed, Aug 13, 2014 at 2:02 PM, Alexei Starovoitov <ast@...mgrid.com> wrote:
> On Wed, Aug 13, 2014 at 11:35 AM, Andy Lutomirski <luto@...capital.net> wrote:
>>
>> The compiler can still think of it as a single insn, though, but some
>> future compiler might not.
>
> I think that would be very dangerous.
> compiler (user space) and kernel interpreter must have the same
> understanding of ISA.
>
>> In any case, I think that, if you use the
>> same code for high and for low, you need logic in the JIT that's at
>> least as complicated.
>
> why do you think so? Handling of pseudo BPF_LD_IMM64 is done
> in single patch #11 which is one of the smallest...
>
>> For example, what happens if you have two
>> consecutive 64-bit immediate loads to the same register? Now you have
>> four consecutive 8-byte insn words that differ only in their immediate
>> values, and you need to split them correctly.
>
> I don't need to do anything special in this case.
> Two 16-byte instructions back to back is not a problem.
> Interpreter or JIT don't care whether they move the same or different
> immediates into the same or different register. Interpreter and JITs
> are dumb on purpose.
> when verifier sees two back to back ld_imm64, the 2nd will simply
> override the value loaded by first one. It's not any different than
> two back to back 'mov dst_reg, imm32' instructions.
But this patch makes the JIT code (and any interpreter) weirdly
stateful. You have:
+ case BPF_LD | BPF_IMM | BPF_DW:
+ /* movabsq %rax, imm64 */
+ EMIT2(add_1mod(0x48, dst_reg), add_1reg(0xB8, dst_reg));
+ EMIT(insn->imm, 4);
+ insn++;
+ i++;
+ EMIT(insn->imm, 4);
+ break;
If you have more than two BPF_LD | BPF_IMM | BPF_DW instructions in a
row, then the way in which they pair up depends on where you start.
I think it would be a lot clearer if you made these be "load low" and
"load high", with JIT code like:
+ case BPF_LOAD_LOW:
+ /* movabsq %rax, imm64 */
+ if (next insn is BPF_LOAD_HIGH) {
+ EMIT2(add_1mod(0x48, dst_reg),
add_1reg(0xB8, dst_reg));
+ EMIT(insn->imm, 4);
+ insn++;
+ i++;
+ EMIT(insn->imm, 4);
+ } else {
+ emit a real load low;
+ }
+ break;
(and you'd have to deal with whether load low by itself is illegal,
zero extends, sign extends, or preserves high bits).
Alternatively, and possibly better, you could have a real encoding for
multiword instructions. Reserve a bit in the opcode to mark a
continuation of the previous instruction, and do:
+ case BPF_LD | BPF_IMM | BPF_DW:
+ assert(insn[1] in bounds && insn[1].code == BPF_CONT);
+ /* movabsq %rax, imm64 */
+ EMIT2(add_1mod(0x48, dst_reg), add_1reg(0xB8, dst_reg));
+ EMIT(insn->imm, 4);
+ insn++;
+ i++;
+ EMIT(insn->imm, 4);
+ break;
This has a nice benefit for future-proofing: it gives you 119 bits of
payload for 16-byte instructions.
On the other hand, a u8 for the opcode is kind of small, and killing
half of that space like this is probably bad. Maybe reserve two high
bits, with:
0: normal opcode or start of a multiword sequence
1: continuation of a multiword sequence
2, 3: reserved for future longer opcode numbers (e.g. 2 could indicate
that "code" is actually 16 bits)
--Andy
--
Andy Lutomirski
AMA Capital Management, LLC
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists