netdev - Re: len = bpf_probe_read_str(); bpf_perf_event

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0839587a-9520-c844-61a3-01a7a30f0015@fb.com>
Date:   Tue, 14 Nov 2017 14:58:24 -0800
From:   Yonghong Song <yhs@...com>
To:     Daniel Borkmann <daniel@...earbox.net>,
        Arnaldo Carvalho de Melo <acme@...nel.org>
CC:     Gianluca Borello <g.borello@...il.com>,
        Alexei Starovoitov <ast@...nel.org>,
        David Miller <davem@...emloft.net>,
        Linux Networking Development Mailing List 
        <netdev@...r.kernel.org>
Subject: Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) ==
 FAIL



On 11/14/17 12:25 PM, Daniel Borkmann wrote:
> On 11/14/2017 07:15 PM, Yonghong Song wrote:
>> On 11/14/17 6:19 AM, Daniel Borkmann wrote:
>>> On 11/14/2017 02:42 PM, Arnaldo Carvalho de Melo wrote:
>>>> Em Tue, Nov 14, 2017 at 02:09:34PM +0100, Daniel Borkmann escreveu:
>>>>> On 11/14/2017 01:58 PM, Arnaldo Carvalho de Melo wrote:
>>>>>> Em Tue, Nov 14, 2017 at 01:09:39AM +0100, Daniel Borkmann escreveu:
>>>>>>> On 11/13/2017 04:08 PM, Arnaldo Carvalho de Melo wrote:
>>>>>>>> libbpf: -- BEGIN DUMP LOG ---
>>>>>>>> libbpf:
>>>>>>>> 0: (79) r3 = *(u64 *)(r1 +104)
>>>>>>>> 1: (b7) r2 = 0
>>>>>>>> 2: (bf) r6 = r1
>>>>>>>> 3: (bf) r1 = r10
>>>>>>>> 4: (07) r1 += -128
>>>>>>>> 5: (b7) r2 = 128
>>>>>>>> 6: (85) call bpf_probe_read_str#45
>>>>>>>> 7: (bf) r1 = r0
>>>>>>>> 8: (07) r1 += -1
>>>>>>>> 9: (67) r1 <<= 32
>>>>>>>> 10: (77) r1 >>= 32
>>>>>>>> 11: (25) if r1 > 0x7f goto pc+11
>>>>>>>
>>>>>>> Right, so the compiler is optimizing the two tests into a single one above,
>>>>>>> which means lower bound cannot properly be derived again by the verifier due
>>>>>>> to this and thus you'll get the error. Similar issue was seen recently [1].
>>>>>>>
>>>>>>> Does the below hack work for you?
>>>>>>>
>>>>>>> int prog([...])
>>>>>>> {
>>>>>>>           char filename[128];
>>>>>>>           int ret = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
>>>>>>>           if (ret > 0)
>>>>>>>                   bpf_perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename,
>>>>>>>                                         ret & (sizeof(filename) - 1));
>>>>>>>           return 1;
>>>>>>> }
>>>>>>>
>>>>>>> r0 should keep on tracking bounds here at least:
>>>>>>>
>>>>>>> prog:
>>>>>>>          0:    bf 16 00 00 00 00 00 00     r6 = r1
>>>>>>>          1:    bf a1 00 00 00 00 00 00     r1 = r10
>>>>>>>          2:    07 01 00 00 80 ff ff ff     r1 += -128
>>>>>>>          3:    b7 02 00 00 80 00 00 00     r2 = 128
>>>>>>>          4:    85 00 00 00 2d 00 00 00     call 45
>>>>>>>          5:    67 00 00 00 20 00 00 00     r0 <<= 32
>>>>>>>          6:    c7 00 00 00 20 00 00 00     r0 s>>= 32
>>>>>>>          7:    b7 01 00 00 01 00 00 00     r1 = 1
>>>>>>>          8:    6d 01 0a 00 00 00 00 00     if r1 s> r0 goto 10
>>>>>>>          9:    57 00 00 00 7f 00 00 00     r0 &= 127
>>>>>>>         10:    bf a4 00 00 00 00 00 00     r4 = r10
>>>>>>>         11:    07 04 00 00 80 ff ff ff     r4 += -128
>>>>>>>         12:    bf 61 00 00 00 00 00 00     r1 = r6
>>>>>>>         13:    18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00     r2 = 0ll
>>>>>>>         15:    18 03 00 00 ff ff ff ff 00 00 00 00 00 00 00 00     r3 = 4294967295ll
>>>>>>>         17:    bf 05 00 00 00 00 00 00     r5 = r0
>>>>>>>         18:    85 00 00 00 19 00 00 00     call 25
>>>>>>>
>>>>>>>     [1] https://urldefense.proofpoint.com/v2/url?u=http-3A__patchwork.ozlabs.org_project_netdev_list_-3Fseries-3D13211&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0&s=z0d6b_hxStA845Kh7epJ-JiFwkiWqUH_z3fEadwqAQY&e=
>>>>>>
>>>>>> Not yet:
>>>>>>
>>>>>> 6: (85) call bpf_probe_read_str#45
>>>>>> 7: (bf) r1 = r0
>>>>>> 8: (67) r1 <<= 32
>>>>>> 9: (77) r1 >>= 32
>>>>>> 10: (15) if r1 == 0x0 goto pc+10
>>>>>>    R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(id=0,off=0,imm=0) R10=fp0
>>>>>> 11: (57) r0 &= 127
>>>>>> 12: (bf) r4 = r10
>>>>>> 13: (07) r4 += -128
>>>>>> 14: (bf) r1 = r6
>>>>>> 15: (18) r2 = 0xffff92bfc2aba840u
>>>>>> 17: (18) r3 = 0xffffffff
>>>>>> 19: (bf) r5 = r0
>>>>>> 20: (85) call bpf_perf_event_output#25
>>>>>> invalid stack type R4 off=-128 access_size=0
>>>>>>
>>>>>> I'll try updating clang/llvm...
>>>>>>
>>>>>> Full details:
>>>>>>
>>>>>> [root@...et bpf]# cat open.c
>>>>>> #include "bpf.h"
>>>>>>
>>>>>> SEC("prog=do_sys_open filename")
>>>>>> int prog(void *ctx, int err, const char __user *filename_ptr)
>>>>>> {
>>>>>>      char filename[128];
>>>>>>      const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
>>>>>
>>>>> Btw, I was using 'int' here above instead of 'unsigned' as strncpy_from_unsafe()
>>>>> could potentially return errors like -EFAULT.
>>>>
>>>> I changed to int, didn't help
>>>>   
>>>>> Currently having a version compiled from the git tree:
>>>>>
>>>>> # llc --version
>>>>> LLVM (https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0&s=BKC_Gu9s1hw0v13OCgCpfsGtAY2hE7dujFqg8LNaK2I&e=):
>>>>>     LLVM version 6.0.0git-2d810c2
>>>>>     Optimized build.
>>>>>     Default target: x86_64-unknown-linux-gnu
>>>>>     Host CPU: skylake
>>>>
>>>> [root@...et bpf]# llc --version
>>>> LLVM (https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0&s=BKC_Gu9s1hw0v13OCgCpfsGtAY2hE7dujFqg8LNaK2I&e=):
>>>>     LLVM version 4.0.0svn
>>>>
>>>> Old stuff! ;-) Will change, but improving these messages should be on
>>>> the radar, I think :-)
>>>
>>> Yep, agree, I think we need a generic, better solution for this type of
>>> issue instead of converting individual helpers to handle 0 min bound and
>>> then only bailing out in such case; need to brainstorm a bit on that.
>>>
>>> I think for the above in your case ...
>>>
>>>    [...]
>>>     6: (85) call bpf_probe_read_str#45
>>>     7: (bf) r1 = r0
>>>     8: (67) r1 <<= 32
>>>     9: (77) r1 >>= 32
>>>    10: (15) if r1 == 0x0 goto pc+10
>>>     R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(id=0,off=0,imm=0) R10=fp0
>>>    11: (57) r0 &= 127
>>>    [...]
>>>
>>> ... the shifts on r1 might be due to using 32 bit type, so if you find
>>> a way to avoid these and have the test on r0 directly, we might get there.
>>> Perhaps keep using a 64 bit type to avoid them. It would be useful to
>>> propagate the deduced bound information back to r0 when we know that
>>> neither r0 nor r1 has changed in the meantime.
>>
>> It is tricky to do in the bpf_program. Compiler tries hard to optimize :-).
>>
>> The issue is at "r0 &= 127".
>>
>> 9: (6d) if r1 s> r0 goto pc+10
>>   R0=inv(id=0,umin_value=1,umax_value=9223372036854775807,var_off=(0x0; 0x7fffffffffffffff)) R1=inv1 R6=ctx(id=0,off=0,imm=0) R10=fp0
>> 10: R0=inv(id=0,umin_value=1,umax_value=9223372036854775807,var_off=(0x0; 0x7fffffffffffffff)) R1=inv1 R6=ctx(id=0,off=0,imm=0) R10=fp0
>> 10: (57) r0 &= 127
>> 11: R0=inv(id=0,umax_value=127,var_off=(0x0; 0x7f)) R1=inv1 R6=ctx(id=0,off=0,imm=0) R10=fp0
>>
>> One possible solution for this problem is to relax the arg4 type
>> from ARG_CONST_SIZE to ARG_CONST_SIZE_OR_ZERO.
> 
> Yeah, I know, that's what I mentioned earlier in this thread to resolve it,
> but do we really want to add this hack everywhere? :( Potentially any function
> having ARG_CONST_SIZE would need to handle size 0 and bail out again in their
> helper implementation and it ends up that progs start relying on this runtime
> check where we won't be able to get rid of it later on anymore.

The compiler actually does the right thing for the below code:
          int ret = bpf_probe_read_str(filename, sizeof(filename),
                                       filename_ptr);
          if (ret > 0)
            bpf_perf_event_output(ctx, &__bpf_stdout__,BPF_F_CURRENT_CPU,
                            filename, ret & (sizeof(filename) - 1));

Just from the above code without consulting bpf_probe_read_str 
internals, it is totally possible that ret = 128, then
ret & (sizeof(filename) - 1) = 0.
The issue is that the verifier did not set the "ret" initial range as 
(-inf, sizeof(filename) - 1). We could have this information associated 
with helper and feed back to verifier.

If we have this range, later for ret & (sizeof(filename) - 1) with ret 
 >= 1, the verifier should be able to conclude
  ret & (sizeof(filename) - 1) >= 1.

To workaround the immediate problem, I tested the following hack
with bcc and it works fine.

BPF_PERF_OUTPUT(events);
int trace(struct pt_regs *ctx) {
   char filename[128];
   int ret = bpf_probe_read_str(filename, sizeof(filename), 0);
   if (ret > 0) {
     if (ret == 1)
           events.perf_submit(ctx, filename, ret);
     else if (ret < 128)
           events.perf_submit(ctx, filename, ret);
   }
   return 1;
}

The idea is to make control flow more complex to prevent llvm
do certain optimizations.

> 
>> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
>> index a5580c6..a68d8bd 100644
>> --- a/kernel/trace/bpf_trace.c
>> +++ b/kernel/trace/bpf_trace.c
>> @@ -393,6 +393,9 @@ BPF_CALL_5(bpf_perf_event_output, struct pt_regs *, regs, struct bpf_map *, map,
>>                  },
>>          };
>>
>> +       if (unlikely(size == 0))
>> +               return 0;
>> +
>>          if (unlikely(flags & ~(BPF_F_INDEX_MASK)))
>>                  return -EINVAL;
>>
>> @@ -407,7 +410,7 @@ static const struct bpf_func_proto bpf_perf_event_output_proto = {
>>          .arg2_type      = ARG_CONST_MAP_PTR,
>>          .arg3_type      = ARG_ANYTHING,
>>          .arg4_type      = ARG_PTR_TO_MEM,
>> -       .arg5_type      = ARG_CONST_SIZE,
>> +       .arg5_type      = ARG_CONST_SIZE_OR_ZERO,
>>   };
>