linux-kernel - Re: [PATCH v7 net-next 1/3] filter: add Extended BPF interpreter and converter

Message-ID: <CAMEtUuwyCD+-aWiOSRWf2+K0iaSxuDuW-Z8Em=LdskDgbtQkWA@mail.gmail.com>
Date:	Sun, 9 Mar 2014 11:02:40 -0700
From:	Alexei Starovoitov <ast@...mgrid.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	"David S. Miller" <davem@...emloft.net>,
	Daniel Borkmann <dborkman@...hat.com>,
	Ingo Molnar <mingo@...nel.org>, Will Drewry <wad@...omium.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	"H. Peter Anvin" <hpa@...or.com>,
	Hagen Paul Pfeifer <hagen@...u.net>,
	Jesse Gross <jesse@...ira.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>,
	Tom Zanussi <tom.zanussi@...ux.intel.com>,
	Jovi Zhangwei <jovi.zhangwei@...il.com>,
	Eric Dumazet <edumazet@...gle.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Arnaldo Carvalho de Melo <acme@...radead.org>,
	Pekka Enberg <penberg@....fi>,
	Arjan van de Ven <arjan@...radead.org>,
	Christoph Hellwig <hch@...radead.org>,
	LKML <linux-kernel@...r.kernel.org>, netdev@...r.kernel.org
Subject: Re: [PATCH v7 net-next 1/3] filter: add Extended BPF interpreter and converter

On Sun, Mar 9, 2014 at 7:49 AM, Eric Dumazet <eric.dumazet@...il.com> wrote:
> On Sat, 2014-03-08 at 15:15 -0800, Alexei Starovoitov wrote:
>
>> +                     if (BPF_SRC(fp->code) == BPF_K &&
>> +                         (int)fp->k < 0) {
>> +                             /* extended BPF immediates are signed,
>> +                              * zero extend immediate into tmp register
>> +                              * and use it in compare insn
>> +                              */
>> +                             insn->code = BPF_ALU | BPF_MOV | BPF_K;
>> +                             insn->a_reg = 2;
>> +                             insn->imm = fp->k;
>> +                             insn++;
>> +
>> +                             insn->a_reg = 6;
>> +                             insn->x_reg = 2;
>> +                             bpf_src = BPF_X;
>> +                     } else {
>> +                             insn->a_reg = 6;
>> +                             insn->x_reg = 7;
>> +                             insn->imm = fp->k;
>> +                             bpf_src = BPF_SRC(fp->code);
>> +                     }
>> +                     /* common case where 'jump_false' is next insn */
>> +                     if (fp->jf == 0) {
>> +                             insn->code = BPF_JMP | BPF_OP(fp->code) |
>> +                                     bpf_src;
>> +                             tgt = i + fp->jt + 1;
>> +                             EMIT_JMP;
>> +                             break;
>> +                     }
>> +                     /* convert JEQ into JNE when 'jump_true' is next insn */
>> +                     if (fp->jt == 0 && BPF_OP(fp->code) == BPF_JEQ) {
>> +                             insn->code = BPF_JMP | BPF_JNE | bpf_src;
>> +                             tgt = i + fp->jf + 1;
>> +                             EMIT_JMP;
>> +                             break;
>> +                     }
>> +                     /* other jumps are mapped into two insns: Jxx and JA */
>> +                     tgt = i + fp->jt + 1;
>> +                     insn->code = BPF_JMP | BPF_OP(fp->code) | bpf_src;
>> +                     EMIT_JMP;
>> +
>> +                     insn++;
>> +                     insn->code = BPF_JMP | BPF_JA;
>> +                     tgt = i + fp->jf + 1;
>> +                     EMIT_JMP;
>> +                     break;
>> +
>> +             /* ldxb 4*([14]&0xf) is remapped into 3 insns */
>> +             case BPF_LDX | BPF_MSH | BPF_B:
>> +                     insn->code = BPF_LD | BPF_ABS | BPF_B;
>> +                     insn->a_reg = 7;
>> +                     insn->imm = fp->k;
>> +
>> +                     insn++;
>> +                     insn->code = BPF_ALU | BPF_AND | BPF_K;
>> +                     insn->a_reg = 7;
>> +                     insn->imm = 0xf;
>> +
>> +                     insn++;
>> +                     insn->code = BPF_ALU | BPF_LSH | BPF_K;
>> +                     insn->a_reg = 7;
>> +                     insn->imm = 2;
>> +                     break;
>> +
>> +             /* RET_K, RET_A are remapped into 2 insns */
>> +             case BPF_RET | BPF_A:
>> +             case BPF_RET | BPF_K:
>> +                     insn->code = BPF_ALU | BPF_MOV |
>> +                             (BPF_RVAL(fp->code) == BPF_K ? BPF_K : BPF_X);
>> +                     insn->a_reg = 0;
>> +                     insn->x_reg = 6;
>> +                     insn->imm = fp->k;
>> +
>> +                     insn++;
>> +                     insn->code = BPF_RET | BPF_K;
>> +                     break;
>
>
> What the hell is this ?
>
> All these magic values, like 2, 6, 7, 10.

They are register numbers: they are assigned to 'a_reg' and 'x_reg',
which are described in uapi/filter.h:
        __u8    a_reg:4; /* dest register */
        __u8    x_reg:4; /* source register */
and in Doc...filter.txt

In the V1 series I had a bunch of #defines like:
#define R1 1
#define R2 2
which seemed as silly as writing '#define one 1'.

I thought the sk_convert_filter() code was pretty clear about what it
is doing, but I'm happy to add an extensive comment describing the
mechanics.
Most of the time you and other folks have asked me to remove comments,
so I figured I would add comments on demand.
This looks like one of those cases.
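
For illustration only, here is a rough sketch of how the register numbers
in the quoted hunk could be given names. None of these identifiers come
from the patch: REG_RET, REG_TMP, REG_A, REG_X, the SK_ prefix, struct
sketch_insn and sketch_convert_msh() are hypothetical. The roles are only
what the quoted assignments and comments suggest (6 holds A, 7 holds X,
2 is the tmp register, 0 receives the return value). The opcode values
are assumed to match the classic BPF encoding in <linux/filter.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for the register numbers seen in the quoted hunk. */
enum sketch_reg {
	REG_RET = 0,	/* destination of the MOV emitted for RET_K/RET_A */
	REG_TMP = 2,	/* scratch register for a negative jump immediate */
	REG_A   = 6,	/* classic BPF accumulator A */
	REG_X   = 7,	/* classic BPF index register X */
};

/* Opcode bits, assumed equal to the classic BPF values; SK_ prefix is ours. */
#define SK_BPF_LD   0x00
#define SK_BPF_ALU  0x04
#define SK_BPF_B    0x10
#define SK_BPF_ABS  0x20
#define SK_BPF_AND  0x50
#define SK_BPF_LSH  0x60
#define SK_BPF_K    0x00

/* Stand-in for the extended instruction; only the fields used here matter. */
struct sketch_insn {
	uint8_t code;
	unsigned int a_reg:4;	/* dest register */
	unsigned int x_reg:4;	/* source register */
	int16_t off;
	int32_t imm;
};

/*
 * Mirrors the quoted BPF_LDX | BPF_MSH | BPF_B case: load a byte,
 * mask it with 0xf, shift left by 2. Returns the number of insns written.
 */
static int sketch_convert_msh(struct sketch_insn *insn, uint32_t k)
{
	struct sketch_insn *first = insn;

	assert(insn != NULL);

	*insn = (struct sketch_insn){
		.code = SK_BPF_LD | SK_BPF_ABS | SK_BPF_B,
		.a_reg = REG_X, .imm = (int32_t)k,
	};
	insn++;
	*insn = (struct sketch_insn){
		.code = SK_BPF_ALU | SK_BPF_AND | SK_BPF_K,
		.a_reg = REG_X, .imm = 0xf,
	};
	insn++;
	*insn = (struct sketch_insn){
		.code = SK_BPF_ALU | SK_BPF_LSH | SK_BPF_K,
		.a_reg = REG_X, .imm = 2,
	};

	return (int)(insn - first) + 1;
}
```

The generated instructions are the same as with the bare numbers. The
only difference is that REG_X says which classic register is the target,
which is the information the review comment was missing.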

> I am afraid nobody will be able to read this but you.

That's certainly not the intent. I presented it at the last Plumbers conf
and would like to share more, since I think ebpf is a fundamental
breakthrough that can be used by many kernel subsystems.
This patch only covers old filters and seccomp.
We can do a lot more interesting things with tracing+ebpf and so on.

Regards,
Alexei