netdev - Re: [PATCH 5/5] MIPS: Add support for eBPF JIT.

Message-ID: <dc3e42b8-e2f6-c678-6658-9789934240fe@caviumnetworks.com>
Date:   Fri, 26 May 2017 09:10:06 -0700
From:   David Daney <ddaney@...iumnetworks.com>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>,
        David Daney <david.daney@...ium.com>
Cc:     Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-mips@...ux-mips.org,
        ralf@...ux-mips.org, Markos Chandras <markos.chandras@...tec.com>
Subject: Re: [PATCH 5/5] MIPS: Add support for eBPF JIT.

On 05/25/2017 07:23 PM, Alexei Starovoitov wrote:
> On Thu, May 25, 2017 at 05:38:26PM -0700, David Daney wrote:
>> Since the eBPF machine has 64-bit registers, we only support this in
>> 64-bit kernels.  As of the writing of this commit log test-bpf is showing:
>>
>>    test_bpf: Summary: 316 PASSED, 0 FAILED, [308/308 JIT'ed]
>>
>> All current test cases are successfully compiled.
>>
>> Signed-off-by: David Daney <david.daney@...ium.com>
>> ---
>>   arch/mips/Kconfig       |    1 +
>>   arch/mips/net/bpf_jit.c | 1627 ++++++++++++++++++++++++++++++++++++++++++++++-
>>   arch/mips/net/bpf_jit.h |    7 +
>>   3 files changed, 1633 insertions(+), 2 deletions(-)
> 
> Great stuff. I wonder what is the performance difference
> interpreter vs JIT

It depends on what the program does.  When it calls library code:

/proc/sys/net/core # echo 0 > bpf_jit_enable
/proc/sys/net/core # modprobe test-bpf test_id=275
test_bpf: #275 BPF_MAXINSNS: ld_abs+vlan_push/pop jited:0 131733 PASS
test_bpf: Summary: 1 PASSED, 0 FAILED, [0/1 JIT'ed]
/proc/sys/net/core # rmmod test-bpf
/proc/sys/net/core # echo 1 > bpf_jit_enable
/proc/sys/net/core # modprobe test-bpf test_id=275
test_bpf: #275 BPF_MAXINSNS: ld_abs+vlan_push/pop jited:1 85453 PASS
test_bpf: Summary: 1 PASSED, 0 FAILED, [1/1 JIT'ed]

About 1.5X faster.

Or doing atomic operations:

/proc/sys/net/core # rmmod test-bpf
/proc/sys/net/core # echo 0 > bpf_jit_enable
/proc/sys/net/core # modprobe test-bpf test_id=229
test_bpf: #229 STX_XADD_DW: X + 1 + 1 + 1 + ... jited:0 209020 PASS
test_bpf: Summary: 1 PASSED, 0 FAILED, [0/1 JIT'ed]
/proc/sys/net/core # rmmod test-bpf
/proc/sys/net/core # echo 1 > bpf_jit_enable
/proc/sys/net/core # modprobe test-bpf test_id=229
test_bpf: #229 STX_XADD_DW: X + 1 + 1 + 1 + ... jited:1 158004 PASS
test_bpf: Summary: 1 PASSED, 0 FAILED, [1/1 JIT'ed]

About 1.3X faster, probably limited by the coherent memory system more
than by code quality.

Simple register operations that do not touch memory benefit the most:

/proc/sys/net/core # rmmod test-bpf
/proc/sys/net/core # echo 0 > bpf_jit_enable
/proc/sys/net/core # modprobe test-bpf test_id=38
test_bpf: #38 INT: ADD 64-bit jited:0 1819 PASS
test_bpf: Summary: 1 PASSED, 0 FAILED, [0/1 JIT'ed]
/proc/sys/net/core # rmmod test-bpf
/proc/sys/net/core # echo 1 > bpf_jit_enable
/proc/sys/net/core # modprobe test-bpf test_id=38
test_bpf: #38 INT: ADD 64-bit jited:1 83 PASS
test_bpf: Summary: 1 PASSED, 0 FAILED, [1/1 JIT'ed]

This one is fairly good.  About 22X faster.
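
For reference, each speedup figure above is simply the interpreter timing divided by the JIT timing for the same test, taken from the number test_bpf prints before PASS. A minimal sketch of that arithmetic follows; the function name speedup() is made up for illustration and is not part of test_bpf.

```c
#include <assert.h>

/*
 * Ratio of the interpreter timing to the JIT timing for one test.
 * Both arguments are the per-test numbers test_bpf prints before PASS.
 * speedup() is a hypothetical helper, not a kernel function.
 */
double speedup(unsigned long interp_time, unsigned long jit_time)
{
	assert(jit_time != 0);	/* a zero timing would make the ratio meaningless */
	return (double)interp_time / (double)jit_time;
}
```

With the timings quoted above, speedup(131733, 85453) is roughly 1.54, speedup(209020, 158004) is roughly 1.32, and speedup(1819, 83) is roughly 21.9.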


> 
>> + * eBPF stack frame will be something like:
>> + *
>> + *  Entry $sp ------>   +--------------------------------+
>> + *                      |   $ra  (optional)              |
>> + *                      +--------------------------------+
>> + *                      |   $s0  (optional)              |
>> + *                      +--------------------------------+
>> + *                      |   $s1  (optional)              |
>> + *                      +--------------------------------+
>> + *                      |   $s2  (optional)              |
>> + *                      +--------------------------------+
>> + *                      |   $s3  (optional)              |
>> + *                      +--------------------------------+
>> + *                      |   tmp-storage  (if $ra saved)  |
>> + * $sp + tmp_offset --> +--------------------------------+ <--BPF_REG_10
>> + *                      |   BPF_REG_10 relative storage  |
>> + *                      |    MAX_BPF_STACK (optional)    |
>> + *                      |      .                         |
>> + *                      |      .                         |
>> + *                      |      .                         |
>> + *     $sp -------->    +--------------------------------+
>> + *
>> + * If BPF_REG_10 is never referenced, then the MAX_BPF_STACK sized
>> + * area is not allocated.
>> + */
> 
> It's especially great to see that you've put the tmp storage
> above program stack and made the stack allocation optional.
> At the moment I'm working on reducing bpf program stack size,
> so that JIT and interpreter can use only the stack they need.
> Looking at this JIT code only minimal changes will be needed.
> 

I originally recorded the minimum and maximum offsets seen from
BPF_REG_10 and generated a minimally sized stack frame.  Then I saw test
cases like:

	{
		"STX_XADD_DW: Test side-effects, r10: 0x12 + 0x10 = 0x22",
		.u.insns_int = {
			BPF_ALU64_REG(BPF_MOV, R1, R10),
			BPF_ALU32_IMM(BPF_MOV, R0, 0x12),
			BPF_ST_MEM(BPF_DW, R10, -40, 0x10),
			BPF_STX_XADD(BPF_DW, R10, R0, -40),
			BPF_ALU64_REG(BPF_MOV, R0, R10),
			BPF_ALU64_REG(BPF_SUB, R0, R1),
			BPF_EXIT_INSN(),
		},
		INTERNAL,
		{ },
		{ { 0, 0 } },
	},

Here the value of BPF_REG_10 escapes into other registers (R1 and R0),
where it can be used for who knows what, so we must assume the worst case.

I guess we could check whether the BPF_REG_10 value ever escapes.  If it
does not, use an optimally sized stack frame, and fall back to
MAX_BPF_STACK only when we cannot prove that is safe.
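
To make the idea concrete, here is a rough sketch of such an escape check. It is not the code from the patch. It runs over a simplified, hypothetical instruction form (struct insn and enum insn_kind below are invented for illustration and are not the kernel's struct bpf_insn), and it assumes MAX_BPF_STACK is 512. Any use of BPF_REG_10 other than as the base register of a load or store counts as an escape.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BPF_STACK 512	/* assumed value of the kernel's stack limit */
#define FP_REG 10		/* BPF_REG_10, the frame pointer */

/* Hypothetical simplified instruction form, for illustration only. */
enum insn_kind {
	INSN_LOAD,	/* dst = *(src + off) */
	INSN_STORE,	/* *(dst + off) = src, or an immediate when src is -1 */
	INSN_ALU,	/* dst = dst op src, including moves; src is -1 for immediates */
	INSN_OTHER,	/* uses no register we care about, e.g. exit */
};

struct insn {
	enum insn_kind kind;
	int dst;
	int src;
	int off;
};

/*
 * Return the number of stack bytes to reserve below BPF_REG_10:
 * 0 if it is never used, the deepest offset seen if every use is a
 * direct load/store base, MAX_BPF_STACK if its value escapes.
 */
int stack_bytes_needed(const struct insn *prog, size_t len)
{
	int depth = 0;
	size_t i;

	assert(prog != NULL || len == 0);
	for (i = 0; i < len; i++) {
		const struct insn *in = &prog[i];
		int is_base = 0;

		switch (in->kind) {
		case INSN_LOAD:
			if (in->dst == FP_REG)
				return MAX_BPF_STACK;	/* frame pointer overwritten */
			is_base = (in->src == FP_REG);
			break;
		case INSN_STORE:
			if (in->src == FP_REG)
				return MAX_BPF_STACK;	/* value stored to memory */
			is_base = (in->dst == FP_REG);
			break;
		case INSN_ALU:
			if (in->dst == FP_REG || in->src == FP_REG)
				return MAX_BPF_STACK;	/* value copied or modified */
			break;
		case INSN_OTHER:
			break;
		}
		if (!is_base)
			continue;
		/* Offsets outside the stack area are treated as unprovable. */
		if (in->off >= 0 || in->off < -MAX_BPF_STACK)
			return MAX_BPF_STACK;
		if (-in->off > depth)
			depth = -in->off;
	}
	return depth;
}
```

Run over the test case quoted above, the first instruction (the move of R10 into R1) already forces the MAX_BPF_STACK fallback. Without the two moves out of R10, the same store and XADD at offset -40 would need only 40 bytes. A real implementation would also have to round the result up to the ABI's stack alignment, which this sketch leaves out.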