[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6e28c1c3ef0eda6f041593216ac32b210e55e4b7.camel@linux.ibm.com>
Date: Tue, 22 Feb 2022 18:42:26 +0100
From: Ilya Leoshkevich <iii@...ux.ibm.com>
To: Jakub Sitnicki <jakub@...udflare.com>
Cc: bpf@...r.kernel.org, netdev@...r.kernel.org,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>,
kernel-team@...udflare.com,
Andrii Nakryiko <andrii.nakryiko@...il.com>
Subject: Re: [PATCH bpf-next] selftests/bpf: Fix implementation-defined
behavior in sk_lookup test
On Tue, 2022-02-22 at 15:53 +0100, Jakub Sitnicki wrote:
> On Tue, Feb 22, 2022 at 03:22 AM +01, Ilya Leoshkevich wrote:
> > On Tue, 2022-02-22 at 01:43 +0100, Ilya Leoshkevich wrote:
> > > On Mon, 2022-02-21 at 22:39 +0100, Ilya Leoshkevich wrote:
> > > > On Mon, 2022-02-21 at 19:03 +0100, Jakub Sitnicki wrote:
> > > > > Shifting 16-bit type by 16 bits is implementation-defined for
> > > > > BPF
> > > > > programs.
> > > > > Don't rely on it in case it is causing the test failures we
> > > > > are
> > > > > seeing on
> > > > > s390x z15 target.
> > > > >
> > > > > Fixes: 2ed0dc5937d3 ("selftests/bpf: Cover 4-byte load from
> > > > > remote_port in bpf_sk_lookup")
> > > > > Reported-by: Andrii Nakryiko <andrii.nakryiko@...il.com>
> > > > > Signed-off-by: Jakub Sitnicki <jakub@...udflare.com>
> > > > > ---
> > > > >
> > > > > I don't have a dev env for s390x/z15 set up yet, so can't
> > > > > definitely
> > > > > confirm the fix.
> > > > > That said, it seems worth fixing either way.
> > > > >
> > > > > tools/testing/selftests/bpf/progs/test_sk_lookup.c | 3 ++-
> > > > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git
> > > > > a/tools/testing/selftests/bpf/progs/test_sk_lookup.c
> > > > > b/tools/testing/selftests/bpf/progs/test_sk_lookup.c
> > > > > index bf5b7caefdd0..7d47276a8964 100644
> > > > > --- a/tools/testing/selftests/bpf/progs/test_sk_lookup.c
> > > > > +++ b/tools/testing/selftests/bpf/progs/test_sk_lookup.c
> > > > > @@ -65,6 +65,7 @@ static const __u32 KEY_SERVER_A = SERVER_A;
> > > > > static const __u32 KEY_SERVER_B = SERVER_B;
> > > > >
> > > > > static const __u16 SRC_PORT = bpf_htons(8008);
> > > > > +static const __u32 SRC_PORT_U32 = bpf_htonl(8008U << 16);
> > > > > static const __u32 SRC_IP4 = IP4(127, 0, 0, 2);
> > > > > static const __u32 SRC_IP6[] = IP6(0xfd000000, 0x0, 0x0,
> > > > > 0x00000002);
> > > > >
> > > > > @@ -421,7 +422,7 @@ int ctx_narrow_access(struct
> > > > > bpf_sk_lookup
> > > > > *ctx)
> > > > >
> > > > > /* Load from remote_port field with zero padding
> > > > > (backward
> > > > > compatibility) */
> > > > > val_u32 = *(__u32 *)&ctx->remote_port;
> > > > > - if (val_u32 != bpf_htonl(bpf_ntohs(SRC_PORT) << 16))
> > > > > + if (val_u32 != SRC_PORT_U32)
> > > > > return SK_DROP;
> > > > >
> > > > > /* Narrow loads from local_port field. Expect
> > > > > DST_PORT.
> > > > > */
> > > >
> > > > Unfortunately this doesn't help with the s390 problem.
> > > > I'll try to debug this.
> > >
> > > I have to admit I have a hard time wrapping my head around the
> > > requirements here.
> > >
> > > Based on the pre-9a69e2b385f4 code, do I understand correctly
> > > that
> > > for the following input
> > >
> > > Port: 0x1f48
> > > SRC_PORT: 0x481f
> > >
> > > we expect the following results for different kinds of loads:
> > >
> > > Size Offset LE BE
> > > BPF_B 0 0x1f 0
> > > BPF_B 1 0x48 0
> > > BPF_B 2 0 0x48
> > > BPF_B 3 0 0x1f
> > > BPF_H 0 0x481f 0
> > > BPF_H 1 0 0x481f
> > > BPF_W 0 0x481f 0x481f
> > >
> > > and this is guaranteed by the struct bpf_sk_lookup ABI? Because
> > > then
> > > it
> > > looks as if 9a69e2b385f4 breaks it on big-endian as follows:
> > >
> > > Size Offset BE-9a69e2b385f4
> > > BPF_B 0 0x48
> > > BPF_B 1 0x1f
> > > BPF_B 2 0
> > > BPF_B 3 0
> > > BPF_H 0 0x481f
> > > BPF_H 1 0
> > > BPF_W 0 0x481f0000
> >
> > Sorry, I worded this incorrectly: 9a69e2b385f4 did not change the
> > kernel behavior, the ABI is not broken and the old compiled code
> > should
> > continue to work.
> > What the second table really shows are what the results should be
> > according to the 9a69e2b385f4 struct bpf_sk_lookup definition,
> > which I
> > still think is broken on big-endian and needs to be adjusted to
> > match
> > the ABI.
> >
> > I noticed one other strange thing in the meantime: loads from
> > *(__u32 *)&ctx->remote_port, *(__u16 *)&ctx->remote_port and
> > *((__u16 *)&ctx->remote_port + 1) all produce 8008 on s390, which
> > is
> > clearly inconsistent. It looks as if convert_ctx_accesses() needs
> > to be
> > adjusted to handle combinations like ctx_field_size == 4 && size ==
> > 2
> > && target_size == 2. I will continue with this tomorrow.
> >
> > > Or is the old behavior a bug and this new one is desirable?
> > > 9a69e2b385f4 has no Fixes: tag, so I assume that's the former :-(
> > >
> > > In which case, would it make sense to fix it by swapping
> > > remote_port
> > > and :16 in bpf_sk_lookup on big-endian?
>
> Thanks for looking into it.
>
> When it comes to requirements, my intention was to keep the same
> behavior as before the split up of the remote_port field in
> 9a69e2b385f4
> ("bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide").
>
> 9a69e2b385f4 was supposed to be a formality, after a similar change
> in
> 4421a582718a ("bpf: Make dst_port field in struct bpf_sock 16-bit
> wide"), which went in earlier.
>
> In 4421a582718a I've provided a bit more context. I understand that
> the
> remote_port value, even before the type changed from u32 to u16,
> appeared to the BPF program as if laid out in memory like so:
>
> offsetof(struct bpf_sk_lookup, remote_port) +0 <port MSB>
> +1 <port LSB>
> +2 0x00
> +3 0x00
>
> Translating it to your handy table format, I expect should result in
> loads as so if port is 8008 = 0x1f48:
>
> Size Offset LE BE
> BPF_B 0 0x1f 0x1f
> BPF_B 1 0x48 0x48
> BPF_B 2 0 0
> BPF_B 3 0 0
> BPF_H 0 0x481f 0x1f48
> BPF_H 1 0 0
> BPF_W 0 0x481f 0x1f480000
Hmm, I think for big-endian the layout is different.
If we look at test_sk_lookup.c from 9a69e2b385f4^:
/* Narrow loads from remote_port field. Expect SRC_PORT. */
if (LSB(ctx->remote_port, 0) != ((SRC_PORT >> 0) & 0xff) ||
LSB(ctx->remote_port, 1) != ((SRC_PORT >> 8) & 0xff) ||
LSB(ctx->remote_port, 2) != 0 || LSB(ctx->remote_port, 3)
!= 0)
return SK_DROP;
LSB() on little-endian is just byte indexing, so it's indeed
1f,48,00,00. However, on big-endian it's indexing from the end, so
it's 00,00,48,1f.
> But since the fix does not work, there must be a mistake somewhere in
> my
> reasoning.
>
> I expect I should be able to get virtme for s390 working sometime
> this
> week to check my math. I've seen your collegue had some luck with it
> [1].
Yeah, I think it should work. In the worst case it should be possible
to tweak vmtest.sh to cross-compile and emulate s390.
> Looking forward to your findings.
>
> [1] https://github.com/cilium/ebpf/issues/86#issuecomment-623945549
Powered by blists - more mailing lists