lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Wed, 9 Sep 2020 10:10:51 -0700
From:   Yonghong Song <yhs@...com>
To:     Andrii Nakryiko <andrii.nakryiko@...il.com>
CC:     bpf <bpf@...r.kernel.org>, Networking <netdev@...r.kernel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Kernel Team <kernel-team@...com>,
        Andrii Nakryiko <andriin@...com>
Subject: Re: [PATCH bpf-next v2] selftests/bpf: fix test_sysctl_loop{1,2}
 failure due to clang change



On 9/9/20 9:32 AM, Andrii Nakryiko wrote:
> On Tue, Sep 8, 2020 at 11:40 PM Yonghong Song <yhs@...com> wrote:
>>
>> Andrii reported that with latest clang, when building selftests, we have
>> error likes:
>>    error: progs/test_sysctl_loop1.c:23:16: in function sysctl_tcp_mem i32 (%struct.bpf_sysctl*):
>>    Looks like the BPF stack limit of 512 bytes is exceeded.
>>    Please move large on stack variables into BPF per-cpu array map.
>>
>> The error is triggered by the following LLVM patch:
>>    https://urldefense.proofpoint.com/v2/url?u=https-3A__reviews.llvm.org_D87134&d=DwIBaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=bsWiCrZT5nrmfm3_om52dUcH95ej2IV4N_-xOE1v14U&s=yM0hEwDFkjIegJg44-iJ8dZYLUk1F7YlKOm65VObloQ&e=
>>
>> For example, the following code is from test_sysctl_loop1.c:
>>    static __always_inline int is_tcp_mem(struct bpf_sysctl *ctx)
>>    {
>>      volatile char tcp_mem_name[] = "net/ipv4/tcp_mem/very_very_very_very_long_pointless_string";
>>      ...
>>    }
>> Without the above LLVM patch, the compiler did optimization to load the string
>> (59 bytes long) with 7 64bit loads, 1 8bit load and 1 16bit load,
>> occupying 64 byte stack size.
>>
>> With the above LLVM patch, the compiler only uses 8bit loads, but subregister is 32bit.
>> So stack requirements become 4 * 59 = 236 bytes. Together with other stuff on
>> the stack, total stack size exceeds 512 bytes, hence compiler complains and quits.
>>
>> To fix the issue, removing "volatile" key word or changing "volatile" to
>> "const"/"static const" does not work, the string is put in .rodata.str1.1 section,
>> which libbpf did not process it and errors out with
>>    libbpf: elf: skipping unrecognized data section(6) .rodata.str1.1
>>    libbpf: prog 'sysctl_tcp_mem': bad map relo against '.L__const.is_tcp_mem.tcp_mem_name'
>>            in section '.rodata.str1.1'
>>
>> Defining the string const as global variable can fix the issue as it puts the string constant
>> in '.rodata' section which is recognized by libbpf. In the future, when libbpf can process
>> '.rodata.str*.*' properly, the global definition can be changed back to local definition.
>>
>> Defining tcp_mem_name as a global, however, triggered a verifier failure.
>>     ./test_progs -n 7/21
>>    libbpf: load bpf program failed: Permission denied
>>    libbpf: -- BEGIN DUMP LOG ---
>>    libbpf:
>>    invalid stack off=0 size=1
>>    verification time 6975 usec
>>    stack depth 160+64
>>    processed 889 insns (limit 1000000) max_states_per_insn 4 total_states
>>    14 peak_states 14 mark_read 10
>>
>>    libbpf: -- END LOG --
>>    libbpf: failed to load program 'sysctl_tcp_mem'
>>    libbpf: failed to load object 'test_sysctl_loop2.o'
>>    test_bpf_verif_scale:FAIL:114
>>    #7/21 test_sysctl_loop2.o:FAIL
>> This actually exposed a bpf program bug. In test_sysctl_loop{1,2}, we have code
>> like
>>    const char tcp_mem_name[] = "<...long string...>";
>>    ...
>>    char name[64];
>>    ...
>>    for (i = 0; i < sizeof(tcp_mem_name); ++i)
>>        if (name[i] != tcp_mem_name[i])
>>            return 0;
>> In the above code, if sizeof(tcp_mem_name) > 64, name[i] access may be
>> out of bound. The sizeof(tcp_mem_name) is 59 for test_sysctl_loop1.c and
>> 79 for test_sysctl_loop2.c.
>>
>> Without promotion-to-global change, old compiler generates code where
>> the overflowed stack access is actually filled with valid value, so hiding
>> the bpf program bug. With promotion-to-global change, the code is different,
>> more specifically, the previous loading constants to stack is gone, and
>> "name" occupies stack[-64:0] and overflow access triggers a verifier error.
>> To fix the issue, adjust "name" buffer size properly.
>>
>> Reported-by: Andrii Nakryiko <andriin@...com>
>> Signed-off-by: Yonghong Song <yhs@...com>
>> ---
>>   tools/testing/selftests/bpf/progs/test_sysctl_loop1.c | 2 +-
>>   tools/testing/selftests/bpf/progs/test_sysctl_loop2.c | 5 +++--
>>   2 files changed, 4 insertions(+), 3 deletions(-)
>>
>> Changelog:
>>    v1 -> v2:
>>      . The tcp_mem_name change actually triggers a verifier failure due to
>>        a bpf program bug. Fixing the bpf program bug can make test pass
>>        with both old and latest llvm. (Alexei)
>>
>> diff --git a/tools/testing/selftests/bpf/progs/test_sysctl_loop1.c b/tools/testing/selftests/bpf/progs/test_sysctl_loop1.c
>> index 458b0d69133e..4b600b1f522f 100644
>> --- a/tools/testing/selftests/bpf/progs/test_sysctl_loop1.c
>> +++ b/tools/testing/selftests/bpf/progs/test_sysctl_loop1.c
>> @@ -18,9 +18,9 @@
>>   #define MAX_ULONG_STR_LEN 7
>>   #define MAX_VALUE_STR_LEN (TCP_MEM_LOOPS * MAX_ULONG_STR_LEN)
>>
>> +const char tcp_mem_name[] = "net/ipv4/tcp_mem/very_very_very_very_long_pointless_string";
>>   static __always_inline int is_tcp_mem(struct bpf_sysctl *ctx)
>>   {
>> -       volatile char tcp_mem_name[] = "net/ipv4/tcp_mem/very_very_very_very_long_pointless_string";
>>          unsigned char i;
>>          char name[64];
>>          int ret;
>> diff --git a/tools/testing/selftests/bpf/progs/test_sysctl_loop2.c b/tools/testing/selftests/bpf/progs/test_sysctl_loop2.c
>> index b2e6f9b0894d..d01056142520 100644
>> --- a/tools/testing/selftests/bpf/progs/test_sysctl_loop2.c
>> +++ b/tools/testing/selftests/bpf/progs/test_sysctl_loop2.c
>> @@ -18,11 +18,12 @@
>>   #define MAX_ULONG_STR_LEN 7
>>   #define MAX_VALUE_STR_LEN (TCP_MEM_LOOPS * MAX_ULONG_STR_LEN)
>>
>> +const char tcp_mem_name[] = "net/ipv4/tcp_mem/very_very_very_very_long_pointless_string_to_stress_byte_loop";
>>   static __attribute__((noinline)) int is_tcp_mem(struct bpf_sysctl *ctx)
>>   {
>> -       volatile char tcp_mem_name[] = "net/ipv4/tcp_mem/very_very_very_very_long_pointless_string_to_stress_byte_loop";
>>          unsigned char i;
>> -       char name[64];
>> +       /* the above tcp_mem_name length is 79, make name buffer length 80 */
>> +       char name[80];
> 
> 
> Wow, did you really count? Why not use sizeof(tcp_mem_name) and drop

The assembly code told me it is 79 and I double checked.

> the comment entirely?

Original intention is probably multiple of 8. At least memset will
have fewer instructions since all 64bit store. But I do see
sizeof(tcp_mem_name) is simpler and easier to understand. Will
respin and send v3.

> 
>>          int ret;
>>
>>          memset(name, 0, sizeof(name));
>> --
>> 2.24.1
>>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ