Date:   Thu, 9 Apr 2020 12:26:37 +0900
From:   "Daniel T. Lee" <danieltimlee@...il.com>
To:     bpf <bpf@...r.kernel.org>, netdev <netdev@...r.kernel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Alexei Starovoitov <ast@...nel.org>,
        Andrii Nakryiko <andrii.nakryiko@...il.com>,
        Martin KaFai Lau <kafai@...com>
Subject: BPF program attached on BPF map function (read,write) is not working?

Currently, a BPF program attached to a BPF map operation (read/write) is not called.
To be specific, a bpf kprobe program on 'htab_map_get_next_key' is not
called at all. To reproduce this behavior, you can try ./tracex6
from 'samples/bpf'. (It does not work properly at all.)

Using 'git bisect', I found that the problem originates from the commit below (v5.0-rc3):
commit 7c4cd051add3 ("bpf: Fix syscall's stackmap lookup potential deadlock")
The diff below is an excerpt of only the problematic code.

   diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
   index b155cd17c1bd..8577bb7f8be6 100644
   --- a/kernel/bpf/syscall.c
   +++ b/kernel/bpf/syscall.c
   @@ -713,8 +713,13 @@ static int map_lookup_elem(union bpf_attr *attr)

           if (bpf_map_is_dev_bound(map)) {
                   err = bpf_map_offload_lookup_elem(map, key, value);
                   goto done;
           }

           preempt_disable();
   +      this_cpu_inc(bpf_prog_active);
           if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
               map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) {
                   err = bpf_percpu_hash_copy(map, key, value);
           } else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) {
                   err = bpf_percpu_array_copy(map, key, value);
   @@ -744,7 +749,10 @@ static int map_lookup_elem(union bpf_attr *attr)
                   }
                   rcu_read_unlock();
           }
   +      this_cpu_dec(bpf_prog_active);
           preempt_enable();

   done:
           if (err)
                   goto free_value;

As you can see from this snippet, the bpf_prog_active value (a flag, I guess?)
is incremented and decremented around the lookup. This creates a
problem where a bpf program attached on a map operation is not called.

   # kernel/trace/bpf_trace.c:74
   unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx)
   {
       ...
        preempt_disable();

        if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
                /*
                 * since some bpf program is already running on this cpu,
                 * don't call into another bpf program (same or different)
                 * and don't send kprobe event into ring-buffer,
                 * so return zero here
                 */
                ret = 0;
                goto out;
        }
       ...
       ret = BPF_PROG_RUN_ARRAY_CHECK(call->prog_array, ctx, BPF_PROG_RUN);

   out:
       __this_cpu_dec(bpf_prog_active);
       preempt_enable();


So trace_call_bpf() at kernel/trace/bpf_trace.c checks whether bpf_prog_active
was already raised, and if it was, it skips executing the bpf program.

Back in the latest kernel (5.6), this this_cpu_{inc|dec}() pair has been wrapped
in bpf_{enable|disable}_instrumentation().

   # include/linux/bpf.h
   static inline void bpf_enable_instrumentation(void)
   {
           if (IS_ENABLED(CONFIG_PREEMPT_RT))
                   this_cpu_dec(bpf_prog_active);
           else
                   __this_cpu_dec(bpf_prog_active);
           migrate_enable();
   }

The functions that use this wrapper are listed below.

   bpf_map_update_value
   bpf_map_copy_value
   map_delete_elem
   generic_map_delete_batch

That is basically most of the map operations.

So, I think this 'unable to attach a bpf program to BPF map operations
(read/write)' behavior is a bug. Or is it the desired behavior?

If it is a bug, bpf_{enable|disable}_instrumentation() should only
cover stackmap, as the commit above intended. I'm not sure, but
perhaps adding a separate flag for the lock might work?

Or, if this is the desired behavior, this limitation should be covered
in the documentation, and the tracex6 sample has to be removed.
