Message-ID: <20200902235340.2001300-1-yhs@fb.com>
Date: Wed, 2 Sep 2020 16:53:40 -0700
From: Yonghong Song <yhs@...com>
To: <bpf@...r.kernel.org>, Lorenz Bauer <lmb@...udflare.com>,
Martin KaFai Lau <kafai@...com>, <netdev@...r.kernel.org>
CC: Alexei Starovoitov <ast@...com>,
Daniel Borkmann <daniel@...earbox.net>, <kernel-team@...com>
Subject: [PATCH bpf 0/2] bpf: do not use bucket_lock for hashmap iterator

Currently, the bpf hashmap iterator takes a bucket_lock, a spin_lock,
before visiting each element in the bucket. This will cause a deadlock
if a map update/delete, issued from the iterator program itself,
operates on an element in the same bucket as the one currently being
visited.
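
For illustration, below is a minimal sketch (not the actual selftest in
this series; the includes, map layout, key/value types and program name
are illustrative) of a bpf_iter program that updates the very hash map
it is iterating, which is the pattern that can deadlock under the
current bucket_lock scheme:

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

struct {
	__uint(type, BPF_MAP_TYPE_HASH);
	__uint(max_entries, 1024);
	__type(key, __u64);
	__type(value, __u64);
} hashmap SEC(".maps");

SEC("iter/bpf_map_elem")
int dump_and_update(struct bpf_iter__bpf_map_elem *ctx)
{
	__u64 *key = ctx->key;
	__u64 *val = ctx->value;
	__u64 k, v;

	/* key/value are NULL once the last element has been visited */
	if (!key || !val)
		return 0;

	k = *key;
	v = *val + 1;

	/* The iterator is holding the bucket's spin_lock while this
	 * program runs; updating an element that hashes to the same
	 * bucket tries to take that lock again and deadlocks. With
	 * rcu_read_lock only, the update simply proceeds.
	 */
	bpf_map_update_elem(&hashmap, &k, &v, BPF_ANY);
	return 0;
}

char _license[] SEC("license") = "GPL";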

To avoid the deadlock, let us just use rcu_read_lock instead of the
bucket_lock. This may result in visiting stale elements, missing some
elements, or visiting some elements more than once if a concurrent map
update/delete happens on the same map. I think using rcu_read_lock is a
reasonable compromise. Users who care about stale/missing/repeated
elements can use the bpf map batch access syscall interface instead.
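
For completeness, here is a minimal user-space sketch of dumping such a
hash map through the batch lookup interface (this is not part of the
series; the key/value types, batch size and error handling are
illustrative only):

#include <errno.h>
#include <stdio.h>
#include <stdbool.h>
#include <bpf/bpf.h>

#define BATCH_SZ 64

/* Dump a BPF_MAP_TYPE_HASH map with __u64 keys/values via the batch
 * lookup syscall. The buffers must be large enough to hold the largest
 * bucket, otherwise the kernel returns -ENOSPC.
 */
static int dump_hash_map_batched(int map_fd)
{
	__u64 keys[BATCH_SZ], vals[BATCH_SZ];
	__u32 batch;
	bool first = true;
	int err;

	for (;;) {
		__u32 count = BATCH_SZ, i;

		err = bpf_map_lookup_batch(map_fd, first ? NULL : &batch,
					   &batch, keys, vals, &count, NULL);
		first = false;

		/* ENOENT means the last bucket has been reached; entries
		 * may still have been copied out for this final batch.
		 */
		if (err && errno != ENOENT)
			return -errno;

		for (i = 0; i < count; i++)
			printf("key %llu -> val %llu\n",
			       (unsigned long long)keys[i],
			       (unsigned long long)vals[i]);

		if (err)
			break;
	}
	return 0;
}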

Note that another approach would be to check, at bpf_iter link creation
time, whether the iter program might do an update/delete on the map
being visited and, if so, reject the link_create. For that approach the
verifier would need to record, for each map referenced by the program,
whether an update/delete operation happens on it. I just feel this
checking is too specialized, hence I still prefer the rcu_read_lock
approach.
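
Roughly, that rejected check would sit at link attach time and look
something like the following (purely hypothetical sketch: the
used_maps_updated[] flag and the function name below do not exist
upstream; only used_maps/used_map_cnt are real bpf_prog_aux fields):

/* Hypothetical only: illustrates where the rejected check would live. */
static int bpf_iter_check_target_map(struct bpf_prog *prog,
				     struct bpf_map *visited_map)
{
	int i;

	/* The verifier would have to record, per map referenced by the
	 * program, whether an update/delete helper is ever called on it.
	 */
	for (i = 0; i < prog->aux->used_map_cnt; i++) {
		if (prog->aux->used_maps[i] == visited_map &&
		    prog->aux->used_maps_updated[i])
			return -EINVAL; /* reject link_create */
	}
	return 0;
}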

Patch #1 contains the kernel implementation and Patch #2 adds a selftest
which can trigger the deadlock without Patch #1.

Yonghong Song (2):
  bpf: do not use bucket_lock for hashmap iterator
  selftests/bpf: add bpf_{update,delete}_map_elem in hashmap iter
    program

 kernel/bpf/hashtab.c                            | 15 ++++-----------
 .../selftests/bpf/progs/bpf_iter_bpf_hash_map.c | 15 +++++++++++++++
 2 files changed, 19 insertions(+), 11 deletions(-)
--
2.24.1