linux-kernel - Re: [PATCH RFC v4 net-next 01/26] net: filter: add "load 64-bit immediate" eBPF instruction

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CALCETrVDrbD3goYmZsUdmEhVfaNxovyghCz6y+_q5+G+rVwtWg@mail.gmail.com>
Date:	Wed, 13 Aug 2014 14:17:24 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	Alexei Starovoitov <ast@...mgrid.com>
Cc:	"David S. Miller" <davem@...emloft.net>,
	Ingo Molnar <mingo@...nel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Daniel Borkmann <dborkman@...hat.com>,
	Chema Gonzalez <chema@...gle.com>,
	Eric Dumazet <edumazet@...gle.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	"H. Peter Anvin" <hpa@...or.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Kees Cook <keescook@...omium.org>,
	Linux API <linux-api@...r.kernel.org>,
	Network Development <netdev@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC v4 net-next 01/26] net: filter: add "load 64-bit
 immediate" eBPF instruction

On Wed, Aug 13, 2014 at 2:02 PM, Alexei Starovoitov <ast@...mgrid.com> wrote:
> On Wed, Aug 13, 2014 at 11:35 AM, Andy Lutomirski <luto@...capital.net> wrote:
>>
>> The compiler can still think of it as a single insn, though, but some
>> future compiler might not.
>
> I think that would be very dangerous.
> compiler (user space) and kernel interpreter must have the same
> understanding of ISA.
>
>> In any case, I think that, if you use the
>> same code for high and for low, you need logic in the JIT that's at
>> least as complicated.
>
> why do you think so? Handling of pseudo BPF_LD_IMM64 is done
> in single patch #11 which is one of the smallest...
>
>> For example, what happens if you have two
>> consecutive 64-bit immediate loads to the same register?  Now you have
>> four consecutive 8-byte insn words that differ only in their immediate
>> values, and you need to split them correctly.
>
> I don't need to do anything special in this case.
> Two 16-byte instructions back to back is not a problem.
> Interpreter or JIT don't care whether they move the same or different
> immediates into the same or different register. Interpreter and JITs
> are dumb on purpose.
> when verifier sees two back to back ld_imm64, the 2nd will simply
> override the value loaded by first one. It's not any different than
> two back to back 'mov dst_reg, imm32' instructions.

But this patch makes the JIT code (and any interpreter) weirdly
stateful.  You have:

+               case BPF_LD | BPF_IMM | BPF_DW:
+                       /* movabsq %rax, imm64 */
+                       EMIT2(add_1mod(0x48, dst_reg), add_1reg(0xB8, dst_reg));
+                       EMIT(insn->imm, 4);
+                       insn++;
+                       i++;
+                       EMIT(insn->imm, 4);
+                       break;

If you have more than two BPF_LD | BPF_IMM | BPF_DW instructions in a
row, then the way in which they pair up depends on where you start.

I think it would be a lot clearer if you made these be "load low" and
"load high", with JIT code like:

+               case BPF_LOAD_LOW:
+                       /* movabsq %rax, imm64 */
+                       if (next insn is BPF_LOAD_HIGH) {
+                           EMIT2(add_1mod(0x48, dst_reg),
add_1reg(0xB8, dst_reg));
+                           EMIT(insn->imm, 4);
+                           insn++;
+                           i++;
+                           EMIT(insn->imm, 4);
+                       } else {
+                           emit a real load low;
+                       }
+                       break;

(and you'd have to deal with whether load low by itself is illegal,
zero extends, sign extends, or preserves high bits).

Alternatively, and possibly better, you could have a real encoding for
multiword instructions.  Reserve a bit in the opcode to mark a
continuation of the previous instruction, and do:

+               case BPF_LD | BPF_IMM | BPF_DW:
+                       assert(insn[1] in bounds && insn[1].code == BPF_CONT);
+                       /* movabsq %rax, imm64 */
+                       EMIT2(add_1mod(0x48, dst_reg), add_1reg(0xB8, dst_reg));
+                       EMIT(insn->imm, 4);
+                       insn++;
+                       i++;
+                       EMIT(insn->imm, 4);
+                       break;

This has a nice benefit for future-proofing: it gives you 119 bits of
payload for 16-byte instructions.

On the other hand, a u8 for the opcode is kind of small, and killing
half of that space like this is probably bad.  Maybe reserve two high
bits, with:

0: normal opcode or start of a multiword sequence
1: continuation of a multiword sequence
2, 3: reserved for future longer opcode numbers (e.g. 2 could indicate
that "code" is actually 16 bits)

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/