lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <177da568-8410-36d6-5f95-c5792ba47d62@fb.com>
Date:   Mon, 20 Dec 2021 22:33:59 -0800
From:   Yonghong Song <yhs@...com>
To:     Matteo Croce <mcroce@...ux.microsoft.com>,
        Alexei Starovoitov <alexei.starovoitov@...il.com>
CC:     bpf <bpf@...r.kernel.org>, Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH bpf-next] bpf: limit bpf_core_types_are_compat() recursion



On 12/17/21 11:31 AM, Matteo Croce wrote:
>   On Wed, Dec 15, 2021 at 7:21 PM Matteo Croce
> <mcroce@...ux.microsoft.com> wrote:
>>
>> On Wed, Dec 15, 2021 at 6:29 PM Alexei Starovoitov
>> <alexei.starovoitov@...il.com> wrote:
>>>
>>> On Wed, Dec 15, 2021 at 6:54 AM Matteo Croce <mcroce@...ux.microsoft.com> wrote:
>>>>>
>>>>> Maybe do a level check here?
>>>>> Since calling it and immediately returning doesn't conserve
>>>>> the stack.
>>>>> If it gets called it can finish fine, but
>>>>> calling it again would be too much.
>>>>> In other words checking the level here gives us
>>>>> room for one more frame.
>>>>>
>>>>
>>>> I thought that the compiler was smart enough to return before
>>>> allocating most of the frame.
>>>> I tried and this is true only with gcc, not with clang.
>>>
>>> Interesting. That's a surprise.
>>> Could you share the asm that gcc generates?
>>>
>>
>> Sure,
>>
>> This is the gcc x86_64 asm on a stripped down
>> bpf_core_types_are_compat() with a 1k struct on the stack:
>>
>> bpf_core_types_are_compat:
>> test esi, esi
>> jle .L69
>> push r15
>> push r14
>> push r13
>> push r12
>> push rbp
>> mov rbp, rdi
>> push rbx
>> mov ebx, esi
>> sub rsp, 9112
>> [...]
>> .L69:
>> or eax, -1
>> ret
>>
>> This latest clang:
>>
>> bpf_core_types_are_compat: # @bpf_core_types_are_compat
>> push rbp
>> push r15
>> push r14
>> push rbx
>> sub rsp, 1000
>> mov r14d, -1
>> test esi, esi
>> jle .LBB0_7
>> [...]
>> .LBB0_7:
>> mov eax, r14d
>> add rsp, 1000
>> pop rbx
>> pop r14
>> pop r15
>> pop rbp
>> ret
>>
>>>>>> +                       err = __bpf_core_types_are_compat(local_btf, local_id,
>>>>>> +                                                         targ_btf, targ_id,
>>>>>> +                                                         level - 1);
>>>>>> +                       if (err <= 0)
>>>>>> +                               return err;
>>>>>> +               }
>>>>>> +
>>>>>> +               /* tail recurse for return type check */
>>>>>> +               btf_type_skip_modifiers(local_btf, local_type->type, &local_id);
>>>>>> +               btf_type_skip_modifiers(targ_btf, targ_type->type, &targ_id);
>>>>>> +               goto recur;
>>>>>> +       }
>>>>>> +       default:
>>>>>> +               pr_warn("unexpected kind %s relocated, local [%d], target [%d]\n",
>>>>>> +                       btf_type_str(local_type), local_id, targ_id);
>>>>>
>>>>> That should be bpf_log() instead.
>>>>>
>>>>
>>>> To do that I need a struct bpf_verifier_log, which is not present
>>>> there, neither in bpf_core_spec_match() or bpf_core_apply_relo_insn().
>>>
>>> It is there. See:
>>>          err = bpf_core_apply_relo_insn((void *)ctx->log, insn, ...
>>>
>>>> Should we drop the message at all?
>>>
>>> Passing it into bpf_core_spec_match() and further into
>>> bpf_core_types_are_compat() is probably unnecessary.
>>> All callers have an error check with a log right after.
>>> So I think we won't lose anything if we drop this log.
>>>
>>
>> Nice.
>>
>>>>
>>>>>> +               return 0;
>>>>>> +       }
>>>>>> +}
>>>>>
>>>>> Please add tests that exercise this logic by enabling
>>>>> additional lskels and a new test that hits the recursion limit.
>>>>> I suspect we don't have such case in selftests.
>>>>>
>>>>> Thanks!
>>>>
>>>> Will do!
>>>
>>> Thanks!
>>
> 
> Hi,
> 
> I'm writing a test which exercise that function.
> I can succesfully trigger a call to __bpf_core_types_are_compat() with
> these  calls:
> 
> bpf_core_type_id_kernel(struct sk_buff);
> bpf_core_type_exists(struct sk_buff);
> bpf_core_type_size(struct sk_buff);
> 
> but the kind will obviously be BTF_KIND_STRUCT.
> I'm trying to do the same with kind BTF_KIND_FUNC_PROTO instead with:
> 
> void func_proto(int, unsigned int);
> bpf_core_type_id_kernel(func_proto);
> 
> but I get a clang crash[1], while just checking the existence with:
> 
> typedef int (*func_proto_typedef)(struct sk_buff *);
> bpf_core_type_exists(func_proto_typedef);
> 
> I don't reach even bpf_core_spec_match().
> 
> Which is a simple way to generate a BTF_KIND_FUNC_PROTO BTF field?
> 
> [1] https://github.com/llvm/llvm-project/issues/52779

Thanks for Matteo. The error message is improved in 
https://reviews.llvm.org/D116063 to make it easy to understand the 
problem. I also posted the explanation here so other people, if hitting
a similar issue, can be aware of what is going on.

The following is a simple reproducible test case:

$ cat bug.c 

extern int do_smth(int); 
 

int test() { 
 

   return __builtin_btf_type_id(*(typeof(do_smth) *)do_smth, 1); 
 

} 
 

$ clang -target bpf -O2 -g -c bug.c 

fatal error: error in backend: Empty type name for BTF_TYPE_ID_REMOTE reloc
...
Let us try to reproduce the IR to see what is really going on with command,

clang -target bpf -O2 -g bug.c -emit-llvm -S -Xclang -disable-llvm-passes
The IR,

define dso_local i32 @test() #0 !dbg !7 {
entry:
   %0 = call i64 @llvm.bpf.btf.type.id(i32 0, i64 1), !dbg !12, 
!llvm.preserve.access.index !13
   %conv = trunc i64 %0 to i32, !dbg !12
   ret i32 %conv, !dbg !15
}
...
!7 = distinct !DISubprogram(name: "test", scope: !1, file: !1, line: 2, 
type: !8, scopeLine: 2, flags: DIFlagAllCallsDescribed, spFlags: 
DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !11)
!8 = !DISubroutineType(types: !9)
!9 = !{!10}
!10 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
!11 = !{}
!12 = !DILocation(line: 3, column: 10, scope: !7)
!13 = !DISubroutineType(types: !14)
!14 = !{!10, !10}
In the above, we really try to relocate a 'subroutine' (func pointer) 
type with debuginfo id 13 which is actually "int ()(int)". There are no 
actually name for type 13 and libbpf is not able to relocate for a 
function "int ()(int)" as it could have many matches.

https://reviews.llvm.org/D116063 improved the error message as below
to make it a little bit more evident what is the problem:

$ clang -target bpf -O2 -g -c bug.c 

fatal error: error in backend: SubroutineType not supported for 
BTF_TYPE_ID_REMOTE reloc


> 
> Regards,

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ