lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 23 Jan 2021 12:45:45 -0800
From:   Yonghong Song <yhs@...com>
To:     Florent Revest <revest@...omium.org>,
        Alexei Starovoitov <alexei.starovoitov@...il.com>
CC:     KP Singh <kpsingh@...nel.org>, bpf <bpf@...r.kernel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>,
        Florent Revest <revest@...gle.com>,
        open list <linux-kernel@...r.kernel.org>,
        Brendan Jackman <jackmanb@...omium.org>
Subject: Re: [PATCH bpf-next v5 4/4] selftests/bpf: Add a selftest for the
 tracing bpf_get_socket_cookie



On 1/22/21 7:34 AM, Florent Revest wrote:
> On Wed, Jan 20, 2021 at 8:06 PM Florent Revest <revest@...omium.org> wrote:
>>
>> On Wed, Jan 20, 2021 at 8:04 PM Alexei Starovoitov
>> <alexei.starovoitov@...il.com> wrote:
>>>
>>> On Wed, Jan 20, 2021 at 9:08 AM KP Singh <kpsingh@...nel.org> wrote:
>>>>
>>>> On Tue, Jan 19, 2021 at 5:00 PM Florent Revest <revest@...omium.org> wrote:
>>>>>
>>>>> This builds up on the existing socket cookie test which checks whether
>>>>> the bpf_get_socket_cookie helpers provide the same value in
>>>>> cgroup/connect6 and sockops programs for a socket created by the
>>>>> userspace part of the test.
>>>>>
>>>>> Adding a tracing program to the existing objects requires a different
>>>>> attachment strategy and different headers.
>>>>>
>>>>> Signed-off-by: Florent Revest <revest@...omium.org>
>>>>
>>>> Acked-by: KP Singh <kpsingh@...nel.org>
>>>>
>>>> (one minor note, doesn't really need fixing as a part of this though)
>>>>
>>>>> ---
>>>>>   .../selftests/bpf/prog_tests/socket_cookie.c  | 24 +++++++----
>>>>>   .../selftests/bpf/progs/socket_cookie_prog.c  | 41 ++++++++++++++++---
>>>>>   2 files changed, 52 insertions(+), 13 deletions(-)
>>>>>
>>>>> diff --git a/tools/testing/selftests/bpf/prog_tests/socket_cookie.c b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
>>>>> index 53d0c44e7907..e5c5e2ea1deb 100644
>>>>> --- a/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
>>>>> +++ b/tools/testing/selftests/bpf/prog_tests/socket_cookie.c
>>>>> @@ -15,8 +15,8 @@ struct socket_cookie {
>>>>>
>>>>>   void test_socket_cookie(void)
>>>>>   {
>>>>> +       struct bpf_link *set_link, *update_sockops_link, *update_tracing_link;
>>>>>          socklen_t addr_len = sizeof(struct sockaddr_in6);
>>>>> -       struct bpf_link *set_link, *update_link;
>>>>>          int server_fd, client_fd, cgroup_fd;
>>>>>          struct socket_cookie_prog *skel;
>>>>>          __u32 cookie_expected_value;
>>>>> @@ -39,15 +39,21 @@ void test_socket_cookie(void)
>>>>>                    PTR_ERR(set_link)))
>>>>>                  goto close_cgroup_fd;
>>>>>
>>>>> -       update_link = bpf_program__attach_cgroup(skel->progs.update_cookie,
>>>>> -                                                cgroup_fd);
>>>>> -       if (CHECK(IS_ERR(update_link), "update-link-cg-attach", "err %ld\n",
>>>>> -                 PTR_ERR(update_link)))
>>>>> +       update_sockops_link = bpf_program__attach_cgroup(
>>>>> +               skel->progs.update_cookie_sockops, cgroup_fd);
>>>>> +       if (CHECK(IS_ERR(update_sockops_link), "update-sockops-link-cg-attach",
>>>>> +                 "err %ld\n", PTR_ERR(update_sockops_link)))
>>>>>                  goto free_set_link;
>>>>>
>>>>> +       update_tracing_link = bpf_program__attach(
>>>>> +               skel->progs.update_cookie_tracing);
>>>>> +       if (CHECK(IS_ERR(update_tracing_link), "update-tracing-link-attach",
>>>>> +                 "err %ld\n", PTR_ERR(update_tracing_link)))
>>>>> +               goto free_update_sockops_link;
>>>>> +
>>>>>          server_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0);
>>>>>          if (CHECK(server_fd < 0, "start_server", "errno %d\n", errno))
>>>>> -               goto free_update_link;
>>>>> +               goto free_update_tracing_link;
>>>>>
>>>>>          client_fd = connect_to_fd(server_fd, 0);
>>>>>          if (CHECK(client_fd < 0, "connect_to_fd", "errno %d\n", errno))
>>>>> @@ -71,8 +77,10 @@ void test_socket_cookie(void)
>>>>>          close(client_fd);
>>>>>   close_server_fd:
>>>>>          close(server_fd);
>>>>> -free_update_link:
>>>>> -       bpf_link__destroy(update_link);
>>>>> +free_update_tracing_link:
>>>>> +       bpf_link__destroy(update_tracing_link);
>>>>
>>>> I don't think this need to block submission unless there are other
>>>> issues but the
>>>> bpf_link__destroy can just be called in a single cleanup label because
>>>> it handles null or
>>>> erroneous inputs:
>>>>
>>>> int bpf_link__destroy(struct bpf_link *link)
>>>> {
>>>>      int err = 0;
>>>>
>>>>      if (IS_ERR_OR_NULL(link))
>>>>           return 0;
>>>> [...]
>>>
>>> +1 to KP's point.
>>>
>>> Also Florent, how did you test it?
>>> This test fails in CI and in my manual run:
>>> ./test_progs -t cook
>>> libbpf: load bpf program failed: Permission denied
>>> libbpf: -- BEGIN DUMP LOG ---
>>> libbpf:
>>> ; int update_cookie_sockops(struct bpf_sock_ops *ctx)
>>> 0: (bf) r6 = r1
>>> ; if (ctx->family != AF_INET6)
>>> 1: (61) r1 = *(u32 *)(r6 +20)
>>> ; if (ctx->family != AF_INET6)
>>> 2: (56) if w1 != 0xa goto pc+21
>>>   R1_w=inv10 R6_w=ctx(id=0,off=0,imm=0) R10=fp0
>>> ; if (ctx->op != BPF_SOCK_OPS_TCP_CONNECT_CB)
>>> 3: (61) r1 = *(u32 *)(r6 +0)
>>> ; if (ctx->op != BPF_SOCK_OPS_TCP_CONNECT_CB)
>>> 4: (56) if w1 != 0x3 goto pc+19
>>>   R1_w=inv3 R6_w=ctx(id=0,off=0,imm=0) R10=fp0
>>> ; if (!ctx->sk)
>>> 5: (79) r1 = *(u64 *)(r6 +184)
>>> ; if (!ctx->sk)
>>> 6: (15) if r1 == 0x0 goto pc+17
>>>   R1_w=sock(id=0,ref_obj_id=0,off=0,imm=0) R6_w=ctx(id=0,off=0,imm=0) R10=fp0
>>> ; p = bpf_sk_storage_get(&socket_cookies, ctx->sk, 0, 0);
>>> 7: (79) r2 = *(u64 *)(r6 +184)
>>> ; p = bpf_sk_storage_get(&socket_cookies, ctx->sk, 0, 0);
>>> 8: (18) r1 = 0xffff888106e41400
>>> 10: (b7) r3 = 0
>>> 11: (b7) r4 = 0
>>> 12: (85) call bpf_sk_storage_get#107
>>> R2 type=sock_or_null expected=sock_common, sock, tcp_sock, xdp_sock, ptr_
>>> processed 12 insns (limit 1000000) max_states_per_insn 0 total_states
>>> 0 peak_states 0 mark_read 0
>>>
>>> libbpf: -- END LOG --
>>> libbpf: failed to load program 'update_cookie_sockops'
>>> libbpf: failed to load object 'socket_cookie_prog'
>>> libbpf: failed to load BPF skeleton 'socket_cookie_prog': -4007
>>> test_socket_cookie:FAIL:socket_cookie_prog__open_and_load skeleton
>>> open_and_load failed
>>> #95 socket_cookie:FAIL
>>> Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
>>
>> Oh :| I must have missed something in the rebase, I will fix this and
>> address KP's comment then. Thanks for the review and sorry for the
>> waste of time :)
> 
> So this is actually an interesting one I think. :) The failure was
> triggered by the combination of an LLVM update and this change:
> 
> -#include <linux/bpf.h>
> +#include "vmlinux.h"
> 
> With an older LLVM, this used to work.
> With a recent LLVM, the change of header causes those 3 lines to get
> compiled differently:
> 
> if (!ctx->sk)
>      return 1;
> p = bpf_sk_storage_get(&socket_cookies, ctx->sk, 0, 0);
> 
> When including linux/bpf.h
> ; if (!ctx->sk)
>         5: 79 62 b8 00 00 00 00 00 r2 = *(u64 *)(r6 + 184)
>         6: 15 02 10 00 00 00 00 00 if r2 == 0 goto +16 <LBB1_6>
> ; p = bpf_sk_storage_get(&socket_cookies, ctx->sk, 0, 0);
>         7: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
>         9: b7 03 00 00 00 00 00 00 r3 = 0
>        10: b7 04 00 00 00 00 00 00 r4 = 0
>        11: 85 00 00 00 6b 00 00 00 call 107
>        12: bf 07 00 00 00 00 00 00 r7 = r0
> 
> When including vmlinux.h
> ; if (!ctx->sk)
>         5: 79 61 b8 00 00 00 00 00 r1 = *(u64 *)(r6 + 184)
>         6: 15 01 11 00 00 00 00 00 if r1 == 0 goto +17 <LBB1_6>
> ; p = bpf_sk_storage_get(&socket_cookies, ctx->sk, 0, 0);
>         7: 79 62 b8 00 00 00 00 00 r2 = *(u64 *)(r6 + 184)
>         8: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
>        10: b7 03 00 00 00 00 00 00 r3 = 0
>        11: b7 04 00 00 00 00 00 00 r4 = 0
>        12: 85 00 00 00 6b 00 00 00 call 107
>        13: bf 07 00 00 00 00 00 00 r7 = r0
> 
> Note that ctx->sk gets fetched once in the first case (l5), and twice
> in the second case (l5 and l7).
> I'm assuming that struct bpf_sock_ops gets defined with different
> attributes in vmlinux.h and causes LLVM to assume that ctx->sk could
> have changed between the time of check and the time of use so it
> yields two fetches and the verifier can't track that r2 is non null.
> 
> If I save ctx->sk in a temporary variable first:
> 
> struct bpf_sock *sk = ctx->sk;
> if (!sk)
>      return 1;
> p = bpf_sk_storage_get(&socket_cookies, sk, 0, 0);
> 
> this loads correctly. However, Brendan pointed out that this is also a
> weak guarantee and that LLVM could still decide to compile this into
> two fetches, Brendan suggested that we save sk out of an access to a
> volatile pointer to ctx, what do you think ?

Your above workaround should be good. Compiler should not do *bad* 
optimizations to have two fetches if the source code just has one
in the above case. If this happens, it will be a llvm bug.

The latest llvm trunk can reproduce the above issue. It is due to
(1). llvm's partial (not complete) CSE and (2). this partial CSE
resulted in llvm BPF backend generating two CORE_MEM operations (for 
CORE relocations) instead of one. If llvm will do complete cse like the 
above your code, we will be fine.

Although fixing llvm CSE is desired, it may take
some time. At the same time, please use your above source workaround.
Thanks.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ