[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CABRcYmK8m21sb8dHbr1wLT_oTCBpvr2Zg-8KHwKuJ2Ak0iTZ_A@mail.gmail.com>
Date: Wed, 10 Mar 2021 12:48:45 +0100
From: Florent Revest <revest@...omium.org>
To: Yonghong Song <yhs@...com>
Cc: bpf <bpf@...r.kernel.org>, Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>,
KP Singh <kpsingh@...nel.org>,
Brendan Jackman <jackmanb@...omium.org>,
open list <linux-kernel@...r.kernel.org>
Subject: Re: [BUG] One-liner array initialization with two pointers in BPF
results in NULLs
On Wed, Mar 10, 2021 at 6:16 AM Yonghong Song <yhs@...com> wrote:
> On 3/9/21 7:43 PM, Yonghong Song wrote:
> > On 3/9/21 5:54 PM, Florent Revest wrote:
> >> I noticed that initializing an array of pointers using this syntax:
> >> __u64 array[] = { (__u64)&var1, (__u64)&var2 };
> >> (which is a fairly common operation with macros such as BPF_SEQ_PRINTF)
> >> always results in array[0] and array[1] being NULL.
> >>
> >> Interestingly, if the array is only initialized with one pointer, ex:
> >> __u64 array[] = { (__u64)&var1 };
> >> Then array[0] will not be NULL.
> >>
> >> Or if the array is initialized field by field, ex:
> >> __u64 array[2];
> >> array[0] = (__u64)&var1;
> >> array[1] = (__u64)&var2;
> >> Then array[0] and array[1] will not be NULL either.
> >>
> >> I'm assuming that this should have something to do with relocations
> >> and might be a bug in clang or in libbpf but because I don't know much
> >> about these, I thought that reporting could be a good first step. :)
> >
> > Thanks for reporting. What you guess is correct, this is due to
> > relocations :-(
> >
> > The compiler notoriously tend to put complex initial values into
> > rodata section. For example, for
> > __u64 array[] = { (__u64)&var1, (__u64)&var2 };
> > the compiler will put
> > { (__u64)&var1, (__u64)&var2 }
> > into rodata section.
> >
> > But &var1 and &var2 themselves need relocation since they are
> > address of static variables which will sit inside .data section.
> >
> > So in the elf file, you will see the following relocations:
> >
> > RELOCATION RECORDS FOR [.rodata]:
> > OFFSET TYPE VALUE
> > 0000000000000018 R_BPF_64_64 .data
> > 0000000000000020 R_BPF_64_64 .data
Right :) Thank you for the explanations Yonghong!
> > Currently, libbpf does not handle relocation inside .rodata
> > section, so they content remains 0.
Just for my own edification, why is .rodata relocation not yet handled
in libbpf ? Is it because of a read-only mapping that makes it more
difficult ?
> > That is why you see the issue with pointer as NULL.
> >
> > With array size of 1, compiler does not bother to put it into
> > rodata section.
> >
> > I *guess* that it works in the macro due to some kind of heuristics,
> > e.g., nested blocks, etc, and llvm did not promote the array init value
> > to rodata. I will double check whether llvm can complete prevent
> > such transformation.
> >
> > Maybe in the future libbpf is able to handle relocations for
> > rodata section too. But for the time being, please just consider to use
> > either macro, or the explicit array assignment.
>
> Digging into the compiler, the compiler tries to make *const* initial
> value into rodata section if the initial value size > 64, so in
> this case, macro does not work either. I think this is how you
> discovered the issue.
Indeed, I was using a macro similar to BPF_SEQ_PRINTF and this is how
I found the bug.
> The llvm does not provide target hooks to
> influence this transformation.
Oh, that is unfortunate :) Thanks for looking into it! I feel that the
real fix would be in libbpf anyway and the rest is just workarounds.
> So, there are two workarounds,
> (1). __u64 param_working[2];
> param_working[0] = (__u64)str1;
> param_working[1] = (__u64)str2;
> (2). BPF_SEQ_PRINTF(seq, "%s ", str1);
> BPF_SEQ_PRINTF(seq, "%s", str2);
(2) is a bit impractical for my actual usecase. I am implementing a
bpf_snprintf helper (patch series Coming Soon TM) and I wanted to keep
the selftest short with a few BPF_SNPRINTF() calls that exercise most
format specifiers.
> In practice, if you have at least one non-const format argument,
> you should be fine. But if all format arguments are constant, then
> none of them should be strings.
Just for context, this does not only happen for strings but also for
all sorts of pointers, for example, when I try to do address lookup of
global __ksym variables, which is important for my selftest.
> Maybe we could change marco
> unsigned long long ___param[] = { args };
> to declare an array explicitly and then have a loop to
> assign each array element?
I think this would be a good workaround for now, indeed. :) I'll look
into it today and send it as part of my bpf_snprintf series.
Thanks!
Powered by blists - more mailing lists