Date:	Sun, 6 Mar 2016 17:58:28 -0800
From:	Alexei Starovoitov <ast@...com>
To:	"David S . Miller" <davem@...emloft.net>
CC:	Daniel Borkmann <daniel@...earbox.net>,
	Daniel Wagner <daniel.wagner@...-carit.de>,
	Tom Zanussi <tom.zanussi@...ux.intel.com>,
	Wang Nan <wangnan0@...wei.com>, He Kuang <hekuang@...wei.com>,
	Martin KaFai Lau <kafai@...com>,
	Brendan Gregg <brendan.d.gregg@...il.com>,
	<netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<kernel-team@...com>
Subject: [PATCH net-next 0/9] bpf: hash map pre-alloc

Hi,

this patch set switches the bpf hash map to pre-allocation by default
and introduces a BPF_F_NO_PREALLOC flag to keep the old behavior for
cases where full map pre-allocation is too memory expensive.

Some time back Daniel Wagner reported crashes when a bpf hash map is
used to compute time intervals between preempt_disable->preempt_enable,
and recently Tom Zanussi reported a deadlock in the iovisor/bcc/funccount
tool when it is used to count invocations of kernel '*spin*' functions.
Both problems are due to the recursive use of slub and can only be
solved by pre-allocating all map elements.

A lot of different solutions were considered; many were implemented,
but in the end pre-allocation turned out to be the only feasible answer.
Pre-allocation itself was implemented in 4 different ways:
- simple free-list with single lock
- percpu_ida with optimizations
- blk-mq-tag variant customized for bpf use case
- percpu_freelist
For bpf style of alloc/free patterns percpu_freelist is the best
and implemented in this patch set.
Detailed performance numbers are in patch 3.
Patch 2 introduces percpu_freelist.
Patch 1 fixes simple deadlocks due to missing recursion checks.
Patches 4-7 prepare the test infrastructure.
Patch 8 adds a stress test for the hash map infrastructure: it attaches
to spin_lock functions, and bpf_map_update/delete are called from
different contexts (except nmi, which bpf still does not support).
Patch 9 adds a map performance test.

Reported-by: Daniel Wagner <daniel.wagner@...-carit.de>
Reported-by: Tom Zanussi <tom.zanussi@...ux.intel.com>

Alexei Starovoitov (9):
  bpf: prevent kprobe+bpf deadlocks
  bpf: introduce percpu_freelist
  bpf: pre-allocate hash map elements
  samples/bpf: make map creation more verbose
  samples/bpf: move ksym_search() into library
  samples/bpf: add map_flags to bpf loader
  samples/bpf: test both pre-alloc and normal maps
  samples/bpf: add bpf map stress test
  samples/bpf: add map performance test

 include/linux/bpf.h              |   4 +
 include/uapi/linux/bpf.h         |   3 +
 kernel/bpf/Makefile              |   2 +-
 kernel/bpf/hashtab.c             | 264 ++++++++++++++++++++++++++++-----------
 kernel/bpf/percpu_freelist.c     |  81 ++++++++++++
 kernel/bpf/percpu_freelist.h     |  31 +++++
 kernel/bpf/syscall.c             |  15 ++-
 kernel/trace/bpf_trace.c         |   2 -
 samples/bpf/Makefile             |   8 ++
 samples/bpf/bpf_helpers.h        |   1 +
 samples/bpf/bpf_load.c           |  70 ++++++++++-
 samples/bpf/bpf_load.h           |   6 +
 samples/bpf/fds_example.c        |   2 +-
 samples/bpf/libbpf.c             |   5 +-
 samples/bpf/libbpf.h             |   2 +-
 samples/bpf/map_perf_test_kern.c | 100 +++++++++++++++
 samples/bpf/map_perf_test_user.c | 155 +++++++++++++++++++++++
 samples/bpf/offwaketime_user.c   |  67 +---------
 samples/bpf/sock_example.c       |   2 +-
 samples/bpf/spintest_kern.c      |  59 +++++++++
 samples/bpf/spintest_user.c      |  50 ++++++++
 samples/bpf/test_maps.c          |  29 +++--
 samples/bpf/test_verifier.c      |   4 +-
 23 files changed, 802 insertions(+), 160 deletions(-)
 create mode 100644 kernel/bpf/percpu_freelist.c
 create mode 100644 kernel/bpf/percpu_freelist.h
 create mode 100644 samples/bpf/map_perf_test_kern.c
 create mode 100644 samples/bpf/map_perf_test_user.c
 create mode 100644 samples/bpf/spintest_kern.c
 create mode 100644 samples/bpf/spintest_user.c

-- 
2.6.5
