Date:	Sun, 6 Mar 2016 17:58:28 -0800
From:	Alexei Starovoitov <ast@...com>
To:	"David S . Miller" <davem@...emloft.net>
CC:	Daniel Borkmann <daniel@...earbox.net>,
	Daniel Wagner <daniel.wagner@...-carit.de>,
	Tom Zanussi <tom.zanussi@...ux.intel.com>,
	Wang Nan <wangnan0@...wei.com>,
	He Kuang <hekuang@...wei.com>,
	Martin KaFai Lau <kafai@...com>,
	Brendan Gregg <brendan.d.gregg@...il.com>,
	<netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<kernel-team@...com>
Subject: [PATCH net-next 0/9] bpf: hash map pre-alloc

Hi,

this patch set switches bpf hash map to use pre-allocation by default
and introduces BPF_F_NO_PREALLOC flag to keep old behavior for cases
where full map pre-allocation is too memory expensive.

Some time back Daniel Wagner reported crashes when bpf hash map is
used to compute time intervals between preempt_disable->preempt_enable
and recently Tom Zanussi reported a deadlock in iovisor/bcc/funccount
tool if it's used to count the number of invocations of kernel
'*spin*' functions. Both problems are due to the recursive use of
slub and can only be solved by pre-allocating all map elements.

A lot of different solutions were considered. Many were implemented,
but in the end pre-allocation seems to be the only feasible answer.
As far as pre-allocation goes, it was also implemented 4 different ways:
- simple free-list with single lock
- percpu_ida with optimizations
- blk-mq-tag variant customized for bpf use case
- percpu_freelist
For bpf style of alloc/free patterns percpu_freelist is the best
and is implemented in this patch set. Detailed performance numbers
are in patch 3.

Patch 1 fixes simple deadlocks due to missing recursion checks.
Patch 2 introduces percpu_freelist.
Patches 4-7 prepare test infra.
Patch 8: stress test for hash map infra.
It attaches to spin_lock functions and bpf_map_update/delete are called
from different contexts (except nmi, which is still unsupported by bpf).
Patch 9: map performance test.

Reported-by: Daniel Wagner <daniel.wagner@...-carit.de>
Reported-by: Tom Zanussi <tom.zanussi@...ux.intel.com>

Alexei Starovoitov (9):
  bpf: prevent kprobe+bpf deadlocks
  bpf: introduce percpu_freelist
  bpf: pre-allocate hash map elements
  samples/bpf: make map creation more verbose
  samples/bpf: move ksym_search() into library
  samples/bpf: add map_flags to bpf loader
  samples/bpf: test both pre-alloc and normal maps
  samples/bpf: add bpf map stress test
  samples/bpf: add map performance test

 include/linux/bpf.h              |   4 +
 include/uapi/linux/bpf.h         |   3 +
 kernel/bpf/Makefile              |   2 +-
 kernel/bpf/hashtab.c             | 264 ++++++++++++++++++++++++++++-----------
 kernel/bpf/percpu_freelist.c     |  81 ++++++++++++
 kernel/bpf/percpu_freelist.h     |  31 +++++
 kernel/bpf/syscall.c             |  15 ++-
 kernel/trace/bpf_trace.c         |   2 -
 samples/bpf/Makefile             |   8 ++
 samples/bpf/bpf_helpers.h        |   1 +
 samples/bpf/bpf_load.c           |  70 ++++++++++-
 samples/bpf/bpf_load.h           |   6 +
 samples/bpf/fds_example.c        |   2 +-
 samples/bpf/libbpf.c             |   5 +-
 samples/bpf/libbpf.h             |   2 +-
 samples/bpf/map_perf_test_kern.c | 100 +++++++++++++++
 samples/bpf/map_perf_test_user.c | 155 +++++++++++++++++++++++
 samples/bpf/offwaketime_user.c   |  67 +---------
 samples/bpf/sock_example.c       |   2 +-
 samples/bpf/spintest_kern.c      |  59 +++++++++
 samples/bpf/spintest_user.c      |  50 ++++++++
 samples/bpf/test_maps.c          |  29 +++--
 samples/bpf/test_verifier.c      |   4 +-
 23 files changed, 802 insertions(+), 160 deletions(-)
 create mode 100644 kernel/bpf/percpu_freelist.c
 create mode 100644 kernel/bpf/percpu_freelist.h
 create mode 100644 samples/bpf/map_perf_test_kern.c
 create mode 100644 samples/bpf/map_perf_test_user.c
 create mode 100644 samples/bpf/spintest_kern.c
 create mode 100644 samples/bpf/spintest_user.c

-- 
2.6.5