Message-ID: <CAEKGpzh3drL1ywEfnJWhAqULcjaqGi+8GZSwG9XV-iYK4DnCpA@mail.gmail.com>
Date: Thu, 9 Apr 2020 12:26:37 +0900
From: "Daniel T. Lee" <danieltimlee@...il.com>
To: bpf <bpf@...r.kernel.org>, netdev <netdev@...r.kernel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Alexei Starovoitov <ast@...nel.org>,
Andrii Nakryiko <andrii.nakryiko@...il.com>,
Martin KaFai Lau <kafai@...com>
Subject: BPF program attached on BPF map function (read,write) is not working?
Currently, a BPF program attached to a BPF map function (read, write) is
not called. To be specific, a bpf kprobe program on 'htab_map_get_next_key'
is not called at all. To test this behavior, you can try ./tracex6 from
'samples/bpf'. (It does not work properly at all.)
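(If you want a quicker check than tracex6, a minimal kprobe program along
these lines will do. This is my own sketch, not the actual samples/bpf
source; load it and watch whether it ever prints.)

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* sketch: print whenever this kprobe program actually runs */
SEC("kprobe/htab_map_get_next_key")
int trace_htab_get_next_key(struct pt_regs *ctx)
{
	char msg[] = "htab_map_get_next_key called\n";

	/* output appears in /sys/kernel/debug/tracing/trace_pipe */
	bpf_trace_printk(msg, sizeof(msg));
	return 0;
}

char _license[] SEC("license") = "GPL";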
By using 'git bisect', I found that the problem is caused by the commit
below (v5.0-rc3).
commit 7c4cd051add3 ("bpf: Fix syscall's stackmap lookup potential deadlock")
The diff below is an excerpt of only the problematic part of that commit.
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index b155cd17c1bd..8577bb7f8be6 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -713,8 +713,13 @@ static int map_lookup_elem(union bpf_attr *attr)
 	if (bpf_map_is_dev_bound(map)) {
 		err = bpf_map_offload_lookup_elem(map, key, value);
 		goto done;
 	}
 	preempt_disable();
+	this_cpu_inc(bpf_prog_active);
 	if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
 	    map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) {
 		err = bpf_percpu_hash_copy(map, key, value);
 	} else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) {
 		err = bpf_percpu_array_copy(map, key, value);
@@ -744,7 +749,10 @@ static int map_lookup_elem(union bpf_attr *attr)
 		}
 		rcu_read_unlock();
 	}
+	this_cpu_dec(bpf_prog_active);
 	preempt_enable();
 done:
 	if (err)
 		goto free_value;
As you can see from this snippet, the bpf_prog_active value (a flag, I
guess?) is incremented and decremented around the map operation. And this
creates the problem where a bpf program attached on a map function is not
called.
# kernel/trace/bpf_trace.c:74
unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx)
{
	...
	preempt_disable();
	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
		/*
		 * since some bpf program is already running on this cpu,
		 * don't call into another bpf program (same or different)
		 * and don't send kprobe event into ring-buffer,
		 * so return zero here
		 */
		ret = 0;
		goto out;
	}
	...
	ret = BPF_PROG_RUN_ARRAY_CHECK(call->prog_array, ctx, BPF_PROG_RUN);
 out:
	__this_cpu_dec(bpf_prog_active);
	preempt_enable();
So trace_call_bpf() at kernel/trace/bpf_trace.c checks whether
bpf_prog_active was already raised (the increment does not return 1), and
if so, it skips the execution of the bpf program.
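To illustrate (my reading of the code, simplified): while a map syscall
such as BPF_MAP_LOOKUP_ELEM is in flight, any kprobe placed on the map
internals it calls enters trace_call_bpf() with the counter already raised:

sys_bpf(BPF_MAP_LOOKUP_ELEM)
  map_lookup_elem()                        /* kernel/bpf/syscall.c */
    this_cpu_inc(bpf_prog_active)          /* 0 -> 1 */
    map->ops->... (htab internals)         /* kprobe fires here */
      trace_call_bpf()
        __this_cpu_inc_return(...) == 2    /* != 1 -> prog is skipped */
    this_cpu_dec(bpf_prog_active)          /* 1 -> 0 */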
Back on the latest kernel (5.6), this this_cpu_{inc|dec}() pair has been
wrapped in bpf_{enable|disable}_instrumentation().
# include/linux/bpf.h
static inline void bpf_enable_instrumentation(void)
{
	if (IS_ENABLED(CONFIG_PREEMPT_RT))
		this_cpu_dec(bpf_prog_active);
	else
		__this_cpu_dec(bpf_prog_active);
	migrate_enable();
}
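For reference, the disable side in the same header is the mirror image; it
raises the counter before the map operation:

static inline void bpf_disable_instrumentation(void)
{
	migrate_disable();
	if (IS_ENABLED(CONFIG_PREEMPT_RT))
		this_cpu_inc(bpf_prog_active);
	else
		__this_cpu_inc(bpf_prog_active);
}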
And the functions which use this wrapper are listed below.
bpf_map_update_value
bpf_map_copy_value
map_delete_elem
generic_map_delete_batch
That is basically most of the map operations.
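In each of them, the actual map access is bracketed roughly like this
(shape only, not a verbatim quote of any of those functions):

	bpf_disable_instrumentation();	/* raises bpf_prog_active */
	/* ... the actual map read/write ... */
	bpf_enable_instrumentation();	/* drops bpf_prog_active */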
So, I think this 'unable to attach a bpf program on a BPF map function
(read, write)' is a bug. Or is it the desired behavior?
If it is a bug, bpf_{enable|disable}_instrumentation() should only cover
the stackmap, as the commit above intended. I'm not sure, but adding
another flag for the lock might work?
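A rough sketch of that idea (untested, and every name in it is made up by
me): keep bpf_prog_active for the tracing-side recursion check, and use a
separate per-cpu counter for the syscall-side lock protection, which only
the stackmap code would consult:

/* all names below are hypothetical, just to illustrate the idea */
DEFINE_PER_CPU(int, bpf_map_locked);

static inline void bpf_map_lock_protect(void)
{
	preempt_disable();
	__this_cpu_inc(bpf_map_locked);	/* note: not bpf_prog_active */
}

static inline void bpf_map_lock_unprotect(void)
{
	__this_cpu_dec(bpf_map_locked);
	preempt_enable();
}

/* bpf_get_stackid() / the stackmap code would then bail out when
 * __this_cpu_read(bpf_map_locked) != 0, instead of relying on
 * trace_call_bpf() seeing bpf_prog_active raised, so kprobe programs
 * on other map internals keep running. */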
Or if this is the desired behavior, it should be covered in the
documentation as a limitation, and the tracex6 sample has to be removed.