Message-Id: <1393468732-3919-1-git-send-email-ast@plumgrid.com>
Date: Wed, 26 Feb 2014 18:38:51 -0800
From: Alexei Starovoitov <ast@...mgrid.com>
To: Daniel Borkmann <dborkman@...hat.com>
Cc: "David S. Miller" <davem@...emloft.net>,
Ingo Molnar <mingo@...nel.org>,
Steven Rostedt <rostedt@...dmis.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
"H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>,
Tom Zanussi <tom.zanussi@...ux.intel.com>,
Jovi Zhangwei <jovi.zhangwei@...il.com>,
Eric Dumazet <edumazet@...gle.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Frederic Weisbecker <fweisbec@...il.com>,
Arnaldo Carvalho de Melo <acme@...radead.org>,
Pekka Enberg <penberg@....fi>,
Arjan van de Ven <arjan@...radead.org>,
Christoph Hellwig <hch@...radead.org>,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Subject: [PATCH v3 net-next 0/1] bpf32->bpf64 mapper and bpf64 interpreter
Hi All,
V1 patches:
http://thread.gmane.org/gmane.linux.kernel/1605783
V2 patches:
http://thread.gmane.org/gmane.linux.kernel/1642325
V3 summary:
- as suggested by Daniel, added an on-the-fly converter from
  old BPF (aka BPF32) into extended BPF (aka BPF64)
- as suggested by Peter Anvin, added 32-bit subregisters;
  they don't add much to interpreter speed, but simplify bpf32->bpf64 mapping
- added sysctl net.core.bpf64_enable flag;
  if enabled, old BPF filters will be converted to BPF64
  and will be used by tcpdump/cls/xtables.
  Safety of the filters is verified by the old BPF sk_chk_filter();
  BPF64's bpf_check() is dropped from this patch to simplify review
Addition of 32-bit subregs requires some work on the BPF64 x86_64 JIT, so
the JIT is not included in this patch set. The LLVM BPF64 backend also needs
to be taught to take advantage of 32-bit subregs.
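To illustrate why 32-bit subregisters help the bpf32->bpf64 mapping, here is a
minimal user-space sketch. The register count, the struct and function names,
and the zero-extension of 32-bit results are assumptions made for the example,
not the definitions from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register file; the register count is an assumption. */
#define NREGS 10
struct regs {
	uint64_t r[NREGS];
};

/* 64-bit add: operates on the full register width. */
static void alu64_add(struct regs *s, int dst, int src)
{
	assert(dst >= 0 && dst < NREGS && src >= 0 && src < NREGS);
	s->r[dst] += s->r[src];
}

/*
 * 32-bit add on the low subregisters. The result wraps at 32 bits and is
 * assumed to be zero-extended into the upper half of the destination.
 */
static void alu32_add(struct regs *s, int dst, int src)
{
	assert(dst >= 0 && dst < NREGS && src >= 0 && src < NREGS);
	s->r[dst] = (uint32_t)((uint32_t)s->r[dst] + (uint32_t)s->r[src]);
}
```

Old BPF arithmetic is 32-bit, so an operation that wraps at 32 bits can be
mapped to a single 32-bit BPF64 instruction, without an extra instruction to
truncate a 64-bit result.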
Initially the BPF64 instruction set was designed for max performance after JIT;
now it has been tweaked for good interpreter speed as well.
Eventually BPF64 can completely replace existing BPF on all architectures.
Two key reasons why BPF64 interpreter is noticeably faster
than existing BPF32 interpreter:
1. fall-through jumps
   In BPF32, jump instructions are forced to take either the 'true' or the
   'false' branch, which causes a branch-miss penalty.
   BPF64 jump instructions have one branch and a fall-through, which fits
   CPU branch predictor logic better.
   'perf stat' shows a drastic difference in branch-misses.
2. jump-threaded implementation of interpreter vs switch statement
   Instead of a single tablejump at the top of the 'switch' statement, GCC will
   generate multiple tablejump instructions, which helps the CPU branch predictor.
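Both points can be sketched in a toy user-space interpreter. This is only an
illustration: the opcodes, the instruction layout and the function name run()
are hypothetical and do not match the encoding in the patch, and the sketch
trusts its input (no verifier). It relies on the GCC/Clang "labels as values"
extension.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opcodes, for illustration only. */
enum { OP_MOV_IMM, OP_ADD_IMM, OP_JEQ_IMM, OP_JA, OP_EXIT, OP_MAX };

struct insn {
	uint8_t op;
	uint8_t dst;	/* destination register index */
	int16_t off;	/* single jump offset, relative to the next insn */
	int32_t imm;
};

static uint64_t run(const struct insn *insn)
{
	/* One label per opcode; every handler ends in its own indirect jump. */
	static const void *const jumptable[OP_MAX] = {
		[OP_MOV_IMM] = &&do_mov_imm,
		[OP_ADD_IMM] = &&do_add_imm,
		[OP_JEQ_IMM] = &&do_jeq_imm,
		[OP_JA]      = &&do_ja,
		[OP_EXIT]    = &&do_exit,
	};
	uint64_t reg[4] = { 0 };

#define DISPATCH() \
	do { assert(insn->op < OP_MAX); goto *jumptable[insn->op]; } while (0)

	DISPATCH();

do_mov_imm:
	reg[insn->dst & 3] = (uint64_t)(int64_t)insn->imm;
	insn++;
	DISPATCH();
do_add_imm:
	reg[insn->dst & 3] += (uint64_t)(int64_t)insn->imm;
	insn++;
	DISPATCH();
do_jeq_imm:
	/* Taken: skip 'off' instructions. Not taken: fall through. */
	if (reg[insn->dst & 3] == (uint64_t)(int64_t)insn->imm)
		insn += insn->off;
	insn++;
	DISPATCH();
do_ja:
	insn += insn->off + 1;
	DISPATCH();
do_exit:
	return reg[0];
#undef DISPATCH
}
```

Because DISPATCH() is expanded at the end of every handler, the compiler can
emit a separate indirect jump per handler instead of one shared jump at the
top of a switch. The conditional jump carries only one offset, so the
not-taken case simply continues with the next instruction.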
Performance of two BPF filters generated by libpcap was measured
on x86_64, i386 and arm32.
fprog #1 is taken from Documentation/networking/filter.txt:
tcpdump -i eth0 port 22 -dd
fprog #2 is taken from 'man tcpdump':
tcpdump -i eth0 'tcp port 22 and (((ip[2:2] - ((ip[0]&0xf)<<2)) -
((tcp[12]&0xf0)>>2)) != 0)' -dd
Other libpcap programs have similar performance differences.
Raw performance data from BPF micro-benchmark:
SK_RUN_FILTER on same SKB (cache-hit) or 10k SKBs (cache-miss)
time in nsec per call, smaller is better
--x86_64--
           fprog #1   fprog #1    fprog #2   fprog #2
           cache-hit  cache-miss  cache-hit  cache-miss
BPF32         90         98         207        220
BPF64         28         85          60        108
BPF32_JIT     12         33          17         44
BPF64_JIT     TBD

--i386--
           fprog #1   fprog #1    fprog #2   fprog #2
           cache-hit  cache-miss  cache-hit  cache-miss
BPF32        107        136         227        252
BPF64         40        119          69        172

--arm32--
           fprog #1   fprog #1    fprog #2   fprog #2
           cache-hit  cache-miss  cache-hit  cache-miss
BPF32        202        300         475        540
BPF64        139        270         296        470
BPF32_JIT     26        182          37        202
BPF64_JIT     TBD
On Intel CPUs the BPF64 interpreter is significantly faster than the
old BPF interpreter. The existing BPF32_JIT is obviously even faster.
BPF64_JIT has similar performance.
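
The cache-hit vs cache-miss measurement method can be approximated in user
space as follows. This is a rough sketch only: fake_filter() is a hypothetical
stand-in for the filter being timed (the numbers above come from SK_RUN_FILTER
on real SKBs), and the packet length, the port offset and the function names
are assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

#define PKT_LEN 64	/* assumed size of each fake packet buffer */

/*
 * Hypothetical stand-in for the filter under test: accepts a packet whose
 * bytes 36..37 (TCP destination port for Ethernet + IPv4 without options)
 * equal 22.
 */
static unsigned int fake_filter(const uint8_t *pkt, size_t len)
{
	if (len < 38)
		return 0;
	return ((pkt[36] << 8) | pkt[37]) == 22 ? 0xffff : 0;
}

/*
 * Average nanoseconds per call over 'calls' invocations, cycling through
 * 'nbufs' buffers. nbufs == 1 reuses one buffer (cache-hit case); a large
 * nbufs touches many buffers (cache-miss case).
 * Returns a negative value if allocation fails.
 */
static double nsec_per_call(size_t nbufs, size_t calls)
{
	struct timespec t0, t1;
	volatile unsigned int sink = 0;	/* keeps the loop from being elided */
	uint8_t *bufs;
	size_t i;

	assert(nbufs > 0 && calls > 0);
	bufs = calloc(nbufs, PKT_LEN);
	if (!bufs)
		return -1.0;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < calls; i++)
		sink += fake_filter(bufs + (i % nbufs) * PKT_LEN, PKT_LEN);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	(void)sink;

	free(bufs);
	return ((double)(t1.tv_sec - t0.tv_sec) * 1e9 +
		(double)(t1.tv_nsec - t0.tv_nsec)) / (double)calls;
}
```

Calling nsec_per_call(1, n) corresponds to the cache-hit columns and
nsec_per_call(10000, n) to the cache-miss columns. Whether 10k small buffers
actually miss the cache depends on the buffer size and the CPU, so absolute
numbers from this sketch are not comparable with the tables above.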
Tested with Daniel's 'trinity BPF fuzzer'
TODO:
- bpf32->bpf64 converter doesn't recognize seccomp and negative
offsets yet, fix that
- add 32-bit subregs to BPF64 x86_64 JIT and LLVM backend
- add bpf64 verifier, so that tcpdump/cls/xt and others can
insert both bpf32 and bpf64 programs through the same interface
- add bpf tables, complete 'dropmonitor' and get back to
systemtap-like probes with bpf64
Please review.
Thanks!
Alexei Starovoitov (1):
bpf32->bpf64 mapper and bpf64 interpreter
 include/linux/filter.h      |    9 +-
 include/linux/netdevice.h   |    1 +
 include/uapi/linux/filter.h |   37 ++-
 net/core/Makefile           |    2 +-
 net/core/bpf_run.c          |  766 +++++++++++++++++++++++++++++++++++++++++++
 net/core/filter.c           |  114 ++++++-
 net/core/sysctl_net_core.c  |    7 +
 7 files changed, 913 insertions(+), 23 deletions(-)
 create mode 100644 net/core/bpf_run.c
--
1.7.9.5