[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <eb0a8485-9624-1727-6913-e4520c9d8c04@fb.com>
Date: Tue, 9 Mar 2021 21:16:29 -0800
From: Yonghong Song <yhs@...com>
To: Florent Revest <revest@...omium.org>, <bpf@...r.kernel.org>
CC: <ast@...nel.org>, <daniel@...earbox.net>, <andrii@...nel.org>,
<kpsingh@...nel.org>, <jackmanb@...omium.org>,
<linux-kernel@...r.kernel.org>
Subject: Re: [BUG] One-liner array initialization with two pointers in BPF
results in NULLs
On 3/9/21 7:43 PM, Yonghong Song wrote:
>
>
> On 3/9/21 5:54 PM, Florent Revest wrote:
>> I noticed that initializing an array of pointers using this syntax:
>> __u64 array[] = { (__u64)&var1, (__u64)&var2 };
>> (which is a fairly common operation with macros such as BPF_SEQ_PRINTF)
>> always results in array[0] and array[1] being NULL.
>>
>> Interestingly, if the array is only initialized with one pointer, ex:
>> __u64 array[] = { (__u64)&var1 };
>> Then array[0] will not be NULL.
>>
>> Or if the array is initialized field by field, ex:
>> __u64 array[2];
>> array[0] = (__u64)&var1;
>> array[1] = (__u64)&var2;
>> Then array[0] and array[1] will not be NULL either.
>>
>> I'm assuming that this should have something to do with relocations
>> and might be a bug in clang or in libbpf but because I don't know much
>> about these, I thought that reporting could be a good first step. :)
>
> Thanks for reporting. What you guess is correct, this is due to
> relocations :-(
>
> The compiler notoriously tend to put complex initial values into
> rodata section. For example, for
> __u64 array[] = { (__u64)&var1, (__u64)&var2 };
> the compiler will put
> { (__u64)&var1, (__u64)&var2 }
> into rodata section.
>
> But &var1 and &var2 themselves need relocation since they are
> address of static variables which will sit inside .data section.
>
> So in the elf file, you will see the following relocations:
>
> RELOCATION RECORDS FOR [.rodata]:
> OFFSET TYPE VALUE
> 0000000000000018 R_BPF_64_64 .data
> 0000000000000020 R_BPF_64_64 .data
>
> Currently, libbpf does not handle relocation inside .rodata
> section, so they content remains 0.
>
> That is why you see the issue with pointer as NULL.
>
> With array size of 1, compiler does not bother to put it into
> rodata section.
>
> I *guess* that it works in the macro due to some kind of heuristics,
> e.g., nested blocks, etc, and llvm did not promote the array init value
> to rodata. I will double check whether llvm can complete prevent
> such transformation.
>
> Maybe in the future libbpf is able to handle relocations for
> rodata section too. But for the time being, please just consider to use
> either macro, or the explicit array assignment.
Digging into the compiler, the compiler tries to make *const* initial
value into rodata section if the initial value size > 64, so in
this case, macro does not work either. I think this is how you
discovered the issue. The llvm does not provide target hooks to
influence this transformation.
So, there are two workarounds,
(1). __u64 param_working[2];
param_working[0] = (__u64)str1;
param_working[1] = (__u64)str2;
(2). BPF_SEQ_PRINTF(seq, "%s ", str1);
BPF_SEQ_PRINTF(seq, "%s", str2);
In practice, if you have at least one non-const format argument,
you should be fine. But if all format arguments are constant, then
none of them should be strings. Maybe we could change marco
unsigned long long ___param[] = { args };
to declare an array explicitly and then have a loop to
assign each array element?
>
> Thanks for the reproducer!
>
>>
>> I attached below a repro with a dummy selftest that I expect should pass
>> but fails to pass with the latest clang and bpf-next. Hopefully, the
>> logic should be simple: I try to print two strings from pointers in an
>> array using bpf_seq_printf but depending on how the array is initialized
>> the helper either receives the string pointers or NULL pointers:
>>
>> test_bug:FAIL:read unexpected read: actual 'str1= str2= str1=STR1
>> str2=STR2 ' != expected 'str1=STR1 str2=STR2 str1=STR1 str2=STR2 '
>>
>> Signed-off-by: Florent Revest <revest@...omium.org>
>> ---
>> tools/testing/selftests/bpf/prog_tests/bug.c | 41 +++++++++++++++++++
>> tools/testing/selftests/bpf/progs/test_bug.c | 43 ++++++++++++++++++++
>> 2 files changed, 84 insertions(+)
>> create mode 100644 tools/testing/selftests/bpf/prog_tests/bug.c
>> create mode 100644 tools/testing/selftests/bpf/progs/test_bug.c
>>
>> diff --git a/tools/testing/selftests/bpf/prog_tests/bug.c
>> b/tools/testing/selftests/bpf/prog_tests/bug.c
>> new file mode 100644
>> index 000000000000..4b0fafd936b7
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/prog_tests/bug.c
>> @@ -0,0 +1,41 @@
>> +#include <test_progs.h>
>> +#include "test_bug.skel.h"
>> +
>> +static int duration;
>> +
>> +void test_bug(void)
>> +{
>> + struct test_bug *skel;
>> + struct bpf_link *link;
>> + char buf[64] = {};
>> + int iter_fd, len;
>> +
>> + skel = test_bug__open_and_load();
>> + if (CHECK(!skel, "test_bug__open_and_load",
>> + "skeleton open_and_load failed\n"))
>> + goto destroy;
>> +
>> + link = bpf_program__attach_iter(skel->progs.bug, NULL);
>> + if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n"))
>> + goto destroy;
>> +
>> + iter_fd = bpf_iter_create(bpf_link__fd(link));
>> + if (CHECK(iter_fd < 0, "create_iter", "create_iter failed\n"))
>> + goto free_link;
>> +
>> + len = read(iter_fd, buf, sizeof(buf));
>> + CHECK(len < 0, "read", "read failed: %s\n", strerror(errno));
>> + // BUG: We expect the strings to be printed in both cases but
>> only the
>> + // second case works.
>> + // actual 'str1= str2= str1=STR1 str2=STR2 '
>> + // != expected 'str1=STR1 str2=STR2 str1=STR1 str2=STR2 '
>> + ASSERT_STREQ(buf, "str1=STR1 str2=STR2 str1=STR1 str2=STR2 ",
>> "read");
>> +
>> + close(iter_fd);
>> +
>> +free_link:
>> + bpf_link__destroy(link);
>> +destroy:
>> + test_bug__destroy(skel);
>> +}
>> +
>> diff --git a/tools/testing/selftests/bpf/progs/test_bug.c
>> b/tools/testing/selftests/bpf/progs/test_bug.c
>> new file mode 100644
>> index 000000000000..c41e69483785
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/progs/test_bug.c
>> @@ -0,0 +1,43 @@
>> +#include "bpf_iter.h"
>> +#include <bpf/bpf_helpers.h>
>> +#include <bpf/bpf_tracing.h>
>> +
>> +char _license[] SEC("license") = "GPL";
>> +
>> +SEC("iter/task")
>> +int bug(struct bpf_iter__task *ctx)
>> +{
>> + struct seq_file *seq = ctx->meta->seq;
>> +
>> + /* We want to print two strings */
>> + static const char fmt[] = "str1=%s str2=%s ";
>> + static char str1[] = "STR1";
>> + static char str2[] = "STR2";
>> +
>> + /*
>> + * Because bpf_seq_printf takes parameters to its format
>> specifiers in
>> + * an array, we need to stuff pointers to str1 and str2 in a u64
>> array.
>> + */
>> +
>> + /* First, we try a one-liner array initialization. Note that this is
>> + * what the BPF_SEQ_PRINTF macro does under the hood. */
>> + __u64 param_not_working[] = { (__u64)str1, (__u64)str2 };
>> + /* But we also try a field by field initialization of the array. We
>> + * would expect the arrays and the behavior to be exactly the
>> same. */
>> + __u64 param_working[2];
>> + param_working[0] = (__u64)str1;
>> + param_working[1] = (__u64)str2;
>> +
>> + /* For convenience, only print once */
>> + if (ctx->meta->seq_num != 0)
>> + return 0;
>> +
>> + /* Using the one-liner array of params, it does not print the
>> strings */
>> + bpf_seq_printf(seq, fmt, sizeof(fmt),
>> + param_not_working, sizeof(param_not_working));
>> + /* Using the field-by-field array of params, it prints the
>> strings */
>> + bpf_seq_printf(seq, fmt, sizeof(fmt),
>> + param_working, sizeof(param_working));
>> +
>> + return 0;
>> +}
>>
Powered by blists - more mailing lists