[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CABRcYm+4u3cp3+AbeJEcy9uDvO91YiPeb07-U_qdAmWTogFKbA@mail.gmail.com>
Date: Tue, 26 Jan 2021 19:00:32 +0100
From: Florent Revest <revest@...omium.org>
To: Yonghong Song <yhs@...com>
Cc: Alexei Starovoitov <alexei.starovoitov@...il.com>,
KP Singh <kpsingh@...nel.org>, bpf <bpf@...r.kernel.org>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>,
Florent Revest <revest@...gle.com>,
open list <linux-kernel@...r.kernel.org>,
Brendan Jackman <jackmanb@...omium.org>
Subject: Re: [PATCH bpf-next v5 4/4] selftests/bpf: Add a selftest for the
tracing bpf_get_socket_cookie
On Sat, Jan 23, 2021 at 9:45 PM Yonghong Song <yhs@...com> wrote:
> On 1/22/21 7:34 AM, Florent Revest wrote:
> > On Wed, Jan 20, 2021 at 8:06 PM Florent Revest <revest@...omium.org> wrote:
> >>
> >> On Wed, Jan 20, 2021 at 8:04 PM Alexei Starovoitov
> >> <alexei.starovoitov@...il.com> wrote:
> >>>
> >>> On Wed, Jan 20, 2021 at 9:08 AM KP Singh <kpsingh@...nel.org> wrote:
> >>>>
> >>>> On Tue, Jan 19, 2021 at 5:00 PM Florent Revest <revest@...omium.org> wrote:
> >>>>>
> >>>>> This builds up on the existing socket cookie test which checks whether
> >>>>> the bpf_get_socket_cookie helpers provide the same value in
> >>>>> cgroup/connect6 and sockops programs for a socket created by the
> >>>>> userspace part of the test.
> >>>>>
> >>>>> Adding a tracing program to the existing objects requires a different
> >>>>> attachment strategy and different headers.
> >>>>>
> >>>>> Signed-off-by: Florent Revest <revest@...omium.org>
> >>>>
> >>>> Acked-by: KP Singh <kpsingh@...nel.org>
> >>>>
> >>>> (one minor note, doesn't really need fixing as a part of this though)
> >>>>
> >>>>> ---
> >>>>> .../selftests/bpf/prog_tests/socket_cookie.c | 24 +++++++----
> >>>>> .../selftests/bpf/progs/socket_cookie_prog.c | 41 ++++++++++++++++---
> >>>>> 2 files changed, 52 insertions(+), 13 deletions(-)
> >>>>>
> >>>>> diff --git a/tools/testing/selftests/bpf/prog_tests/socket_cookie.c b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
> >>>>> index 53d0c44e7907..e5c5e2ea1deb 100644
> >>>>> --- a/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
> >>>>> +++ b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
> >>>>> @@ -15,8 +15,8 @@ struct socket_cookie {
> >>>>>
> >>>>> void test_socket_cookie(void)
> >>>>> {
> >>>>> + struct bpf_link *set_link, *update_sockops_link, *update_tracing_link;
> >>>>> socklen_t addr_len = sizeof(struct sockaddr_in6);
> >>>>> - struct bpf_link *set_link, *update_link;
> >>>>> int server_fd, client_fd, cgroup_fd;
> >>>>> struct socket_cookie_prog *skel;
> >>>>> __u32 cookie_expected_value;
> >>>>> @@ -39,15 +39,21 @@ void test_socket_cookie(void)
> >>>>> PTR_ERR(set_link)))
> >>>>> goto close_cgroup_fd;
> >>>>>
> >>>>> - update_link = bpf_program__attach_cgroup(skel->progs.update_cookie,
> >>>>> - cgroup_fd);
> >>>>> - if (CHECK(IS_ERR(update_link), "update-link-cg-attach", "err %ld\n",
> >>>>> - PTR_ERR(update_link)))
> >>>>> + update_sockops_link = bpf_program__attach_cgroup(
> >>>>> + skel->progs.update_cookie_sockops, cgroup_fd);
> >>>>> + if (CHECK(IS_ERR(update_sockops_link), "update-sockops-link-cg-attach",
> >>>>> + "err %ld\n", PTR_ERR(update_sockops_link)))
> >>>>> goto free_set_link;
> >>>>>
> >>>>> + update_tracing_link = bpf_program__attach(
> >>>>> + skel->progs.update_cookie_tracing);
> >>>>> + if (CHECK(IS_ERR(update_tracing_link), "update-tracing-link-attach",
> >>>>> + "err %ld\n", PTR_ERR(update_tracing_link)))
> >>>>> + goto free_update_sockops_link;
> >>>>> +
> >>>>> server_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0);
> >>>>> if (CHECK(server_fd < 0, "start_server", "errno %d\n", errno))
> >>>>> - goto free_update_link;
> >>>>> + goto free_update_tracing_link;
> >>>>>
> >>>>> client_fd = connect_to_fd(server_fd, 0);
> >>>>> if (CHECK(client_fd < 0, "connect_to_fd", "errno %d\n", errno))
> >>>>> @@ -71,8 +77,10 @@ void test_socket_cookie(void)
> >>>>> close(client_fd);
> >>>>> close_server_fd:
> >>>>> close(server_fd);
> >>>>> -free_update_link:
> >>>>> - bpf_link__destroy(update_link);
> >>>>> +free_update_tracing_link:
> >>>>> + bpf_link__destroy(update_tracing_link);
> >>>>
> >>>> I don't think this need to block submission unless there are other
> >>>> issues but the
> >>>> bpf_link__destroy can just be called in a single cleanup label because
> >>>> it handles null or
> >>>> erroneous inputs:
> >>>>
> >>>> int bpf_link__destroy(struct bpf_link *link)
> >>>> {
> >>>> int err = 0;
> >>>>
> >>>> if (IS_ERR_OR_NULL(link))
> >>>> return 0;
> >>>> [...]
> >>>
> >>> +1 to KP's point.
> >>>
> >>> Also Florent, how did you test it?
> >>> This test fails in CI and in my manual run:
> >>> ./test_progs -t cook
> >>> libbpf: load bpf program failed: Permission denied
> >>> libbpf: -- BEGIN DUMP LOG ---
> >>> libbpf:
> >>> ; int update_cookie_sockops(struct bpf_sock_ops *ctx)
> >>> 0: (bf) r6 = r1
> >>> ; if (ctx->family != AF_INET6)
> >>> 1: (61) r1 = *(u32 *)(r6 +20)
> >>> ; if (ctx->family != AF_INET6)
> >>> 2: (56) if w1 != 0xa goto pc+21
> >>> R1_w=inv10 R6_w=ctx(id=0,off=0,imm=0) R10=fp0
> >>> ; if (ctx->op != BPF_SOCK_OPS_TCP_CONNECT_CB)
> >>> 3: (61) r1 = *(u32 *)(r6 +0)
> >>> ; if (ctx->op != BPF_SOCK_OPS_TCP_CONNECT_CB)
> >>> 4: (56) if w1 != 0x3 goto pc+19
> >>> R1_w=inv3 R6_w=ctx(id=0,off=0,imm=0) R10=fp0
> >>> ; if (!ctx->sk)
> >>> 5: (79) r1 = *(u64 *)(r6 +184)
> >>> ; if (!ctx->sk)
> >>> 6: (15) if r1 == 0x0 goto pc+17
> >>> R1_w=sock(id=0,ref_obj_id=0,off=0,imm=0) R6_w=ctx(id=0,off=0,imm=0) R10=fp0
> >>> ; p = bpf_sk_storage_get(&socket_cookies, ctx->sk, 0, 0);
> >>> 7: (79) r2 = *(u64 *)(r6 +184)
> >>> ; p = bpf_sk_storage_get(&socket_cookies, ctx->sk, 0, 0);
> >>> 8: (18) r1 = 0xffff888106e41400
> >>> 10: (b7) r3 = 0
> >>> 11: (b7) r4 = 0
> >>> 12: (85) call bpf_sk_storage_get#107
> >>> R2 type=sock_or_null expected=sock_common, sock, tcp_sock, xdp_sock, ptr_
> >>> processed 12 insns (limit 1000000) max_states_per_insn 0 total_states
> >>> 0 peak_states 0 mark_read 0
> >>>
> >>> libbpf: -- END LOG --
> >>> libbpf: failed to load program 'update_cookie_sockops'
> >>> libbpf: failed to load object 'socket_cookie_prog'
> >>> libbpf: failed to load BPF skeleton 'socket_cookie_prog': -4007
> >>> test_socket_cookie:FAIL:socket_cookie_prog__open_and_load skeleton
> >>> open_and_load failed
> >>> #95 socket_cookie:FAIL
> >>> Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> >>
> >> Oh :| I must have missed something in the rebase, I will fix this and
> >> address KP's comment then. Thanks for the review and sorry for the
> >> waste of time :)
> >
> > So this is actually an interesting one I think. :) The failure was
> > triggered by the combination of an LLVM update and this change:
> >
> > -#include <linux/bpf.h>
> > +#include "vmlinux.h"
> >
> > With an older LLVM, this used to work.
> > With a recent LLVM, the change of header causes those 3 lines to get
> > compiled differently:
> >
> > if (!ctx->sk)
> > return 1;
> > p = bpf_sk_storage_get(&socket_cookies, ctx->sk, 0, 0);
> >
> > When including linux/bpf.h
> > ; if (!ctx->sk)
> > 5: 79 62 b8 00 00 00 00 00 r2 = *(u64 *)(r6 + 184)
> > 6: 15 02 10 00 00 00 00 00 if r2 == 0 goto +16 <LBB1_6>
> > ; p = bpf_sk_storage_get(&socket_cookies, ctx->sk, 0, 0);
> > 7: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
> > 9: b7 03 00 00 00 00 00 00 r3 = 0
> > 10: b7 04 00 00 00 00 00 00 r4 = 0
> > 11: 85 00 00 00 6b 00 00 00 call 107
> > 12: bf 07 00 00 00 00 00 00 r7 = r0
> >
> > When including vmlinux.h
> > ; if (!ctx->sk)
> > 5: 79 61 b8 00 00 00 00 00 r1 = *(u64 *)(r6 + 184)
> > 6: 15 01 11 00 00 00 00 00 if r1 == 0 goto +17 <LBB1_6>
> > ; p = bpf_sk_storage_get(&socket_cookies, ctx->sk, 0, 0);
> > 7: 79 62 b8 00 00 00 00 00 r2 = *(u64 *)(r6 + 184)
> > 8: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
> > 10: b7 03 00 00 00 00 00 00 r3 = 0
> > 11: b7 04 00 00 00 00 00 00 r4 = 0
> > 12: 85 00 00 00 6b 00 00 00 call 107
> > 13: bf 07 00 00 00 00 00 00 r7 = r0
> >
> > Note that ctx->sk gets fetched once in the first case (l5), and twice
> > in the second case (l5 and l7).
> > I'm assuming that struct bpf_sock_ops gets defined with different
> > attributes in vmlinux.h and causes LLVM to assume that ctx->sk could
> > have changed between the time of check and the time of use so it
> > yields two fetches and the verifier can't track that r2 is non null.
> >
> > If I save ctx->sk in a temporary variable first:
> >
> > struct bpf_sock *sk = ctx->sk;
> > if (!sk)
> > return 1;
> > p = bpf_sk_storage_get(&socket_cookies, sk, 0, 0);
> >
> > this loads correctly. However, Brendan pointed out that this is also a
> > weak guarantee and that LLVM could still decide to compile this into
> > two fetches, Brendan suggested that we save sk out of an access to a
> > volatile pointer to ctx, what do you think ?
>
> Your above workaround should be good. Compiler should not do *bad*
> optimizations to have two fetches if the source code just has one
> in the above case. If this happens, it will be a llvm bug.
>
> The latest llvm trunk can reproduce the above issue. It is due to
> (1). llvm's partial (not complete) CSE and (2). this partial CSE
> resulted in llvm BPF backend generating two CORE_MEM operations (for
> CORE relocations) instead of one. If llvm will do complete cse like the
> above your code, we will be fine.
>
> Although fixing llvm CSE is desired, it may take
> some time. At the same time, please use your above source workaround.
> Thanks.
Good to know! Thank you Yonghong :)
Powered by blists - more mailing lists