netdev - Re: [bpf-next PATCH 2/4] selftests/bpf: Drop python client/server in favor of threads

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201029181258.ezff3vfpar7fxbam@kafai-mbp.dhcp.thefacebook.com>
Date:   Thu, 29 Oct 2020 11:12:58 -0700
From:   Martin KaFai Lau <kafai@...com>
To:     Alexander Duyck <alexander.duyck@...il.com>
CC:     <bpf@...r.kernel.org>, Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        John Fastabend <john.fastabend@...il.com>,
        Kernel Team <kernel-team@...com>,
        Netdev <netdev@...r.kernel.org>,
        Eric Dumazet <edumazet@...gle.com>, <brakmo@...com>,
        <alexanderduyck@...com>
Subject: Re: [bpf-next PATCH 2/4] selftests/bpf: Drop python client/server in
 favor of threads

On Thu, Oct 29, 2020 at 09:58:15AM -0700, Alexander Duyck wrote:
[ ... ]

> > > @@ -43,7 +94,9 @@ int verify_result(const struct tcpbpf_globals *result)
> > >       EXPECT_EQ(0x80, result->bad_cb_test_rv, PRIu32);
> > >       EXPECT_EQ(0, result->good_cb_test_rv, PRIu32);
> > >       EXPECT_EQ(1, result->num_listen, PRIu32);
> > > -     EXPECT_EQ(EXPECTED_CLOSE_EVENTS, result->num_close_events, PRIu32);
> > > +
> > > +     /* 3 comes from one listening socket + both ends of the connection */
> > > +     EXPECT_EQ(3, result->num_close_events, PRIu32);
> > >
> > >       return ret;
> > >  }
> > > @@ -67,6 +120,52 @@ int verify_sockopt_result(int sock_map_fd)
> > >       return ret;
> > >  }
> > >
> > > +static int run_test(void)
> > > +{
> > > +     int server_fd, client_fd;
> > > +     void *server_err;
> > > +     char buf[1000];
> > > +     pthread_t tid;
> > > +     int err = -1;
> > > +     int i;
> > > +
> > > +     server_fd = start_server(AF_INET6, SOCK_STREAM, LO_ADDR6, 0, 0);
> > > +     if (CHECK_FAIL(server_fd < 0))
> > > +             return err;
> > > +
> > > +     pthread_mutex_lock(&server_started_mtx);
> > > +     if (CHECK_FAIL(pthread_create(&tid, NULL, server_thread,
> > > +                                   (void *)&server_fd)))
> > > +             goto close_server_fd;
> > > +
> > > +     pthread_cond_wait(&server_started, &server_started_mtx);
> > > +     pthread_mutex_unlock(&server_started_mtx);
> > > +
> > > +     client_fd = connect_to_fd(server_fd, 0);
> > > +     if (client_fd < 0)
> > > +             goto close_server_fd;
> > > +
> > > +     for (i = 0; i < 1000; i++)
> > > +             buf[i] = '+';
> > > +
> > > +     if (CHECK_FAIL(send(client_fd, buf, 1000, 0) < 1000))
> > > +             goto close_client_fd;
> > > +
> > > +     if (CHECK_FAIL(recv(client_fd, buf, 500, 0) < 500))
> > > +             goto close_client_fd;
> > > +
> > > +     pthread_join(tid, &server_err);
> > I think this can be further simplified without starting thread
> > and do everything in run_test() instead.
> >
> > Something like this (uncompiled code):
> >
> >         accept_fd = accept(server_fd, NULL, 0);
> >         send(client_fd, plus_buf, 1000, 0);
> >         recv(accept_fd, recv_buf, 1000, 0);
> >         send(accept_fd, dot_buf, 500, 0);
> >         recv(client_fd, recv_buf, 500, 0);
> 
> I can take a look at switching it over.
> 
> > > +
> > > +     err = (int)(long)server_err;
> > > +     CHECK_FAIL(err);
> > > +
> > > +close_client_fd:
> > > +     close(client_fd);
> > > +close_server_fd:
> > > +     close(server_fd);
> > > +     return err;
> > > +}
> > > +
> > >  void test_tcpbpf_user(void)
> > >  {
> > >       const char *file = "test_tcpbpf_kern.o";
> > > @@ -74,7 +173,6 @@ void test_tcpbpf_user(void)
> > >       struct tcpbpf_globals g = {0};
> > >       struct bpf_object *obj;
> > >       int cg_fd = -1;
> > > -     int retry = 10;
> > >       __u32 key = 0;
> > >       int rv;
> > >
> > > @@ -94,11 +192,6 @@ void test_tcpbpf_user(void)
> > >               goto err;
> > >       }
> > >
> > > -     if (CHECK_FAIL(system("./tcp_server.py"))) {
> > > -             fprintf(stderr, "FAILED: TCP server\n");
> > > -             goto err;
> > > -     }
> > > -
> > >       map_fd = bpf_find_map(__func__, obj, "global_map");
> > >       if (CHECK_FAIL(map_fd < 0))
> > >               goto err;
> > > @@ -107,21 +200,17 @@ void test_tcpbpf_user(void)
> > >       if (CHECK_FAIL(sock_map_fd < 0))
> > >               goto err;
> > >
> > > -retry_lookup:
> > > +     if (run_test()) {
> > > +             fprintf(stderr, "FAILED: TCP server\n");
> > > +             goto err;
> > > +     }
> > > +
> > >       rv = bpf_map_lookup_elem(map_fd, &key, &g);
> > >       if (CHECK_FAIL(rv != 0)) {
> > CHECK() is a better one here if it needs to output error message.
> > The same goes for similar usages in this patch set.
> >
> > For the start_server() above which has already logged the error message,
> > CHECK_FAIL() is good enough.
> >
> > >               fprintf(stderr, "FAILED: bpf_map_lookup_elem returns %d\n", rv);
> > >               goto err;
> > >       }
> > >
> > > -     if (g.num_close_events != EXPECTED_CLOSE_EVENTS && retry--) {
> > It is good to have a solution to avoid a test depending on some number
> > of retries.
> >
> > After looking at BPF_SOCK_OPS_STATE_CB in test_tcpbpf_kern.c,
> > it is not clear to me removing python alone is enough to avoid the
> > race (so the retry--).  One of the sk might still be in TCP_LAST_ACK
> > instead of TCP_CLOSE.
> >
> 
> After you pointed this out I decided to go back through and do some
> further testing. After testing this for several thousand iterations it
> does look like the issue can still happen, it was just significantly
> less frequent with the threaded approach, but it was still there. So I
> will go back through and add this back and then fold it into the
> verify_results function in the third patch. Although I might reduce
> the wait times as it seems like with the inline approach we only need
> in the 10s of microseconds instead of 100s for the sockets to close
> out.
I think this retry-and-wait can be avoided.  More on this...

> 
> > Also, when looking closer at BPF_SOCK_OPS_STATE_CB in test_tcpbpf_kern.c,
> > it seems the map value "gp" is slapped together across multiple
> > TCP_CLOSE events which may be not easy to understand.
> >
> > How about it checks different states: TCP_CLOSE, TCP_LAST_ACK,
> > and BPF_TCP_FIN_WAIT2.  Each of this state will update its own
> > values under "gp".  Something like this (only compiler tested on
> > top of patch 4):
> >
> > diff --git i/tools/testing/selftests/bpf/prog_tests/tcpbpf_user.c w/tools/testing/selftests/bpf/prog_tests/tcpbpf_user.c
> > index 7e92c37976ac..65b247b03dfc 100644
> > --- i/tools/testing/selftests/bpf/prog_tests/tcpbpf_user.c
> > +++ w/tools/testing/selftests/bpf/prog_tests/tcpbpf_user.c
> > @@ -90,15 +90,14 @@ static void verify_result(int map_fd, int sock_map_fd)
> >               result.event_map, expected_events);
> >
> >         ASSERT_EQ(result.bytes_received, 501, "bytes_received");
> > -       ASSERT_EQ(result.bytes_acked, 1002, "bytes_acked");
> > +       ASSERT_EQ(result.bytes_acked, 1001, "bytes_acked");
> >         ASSERT_EQ(result.data_segs_in, 1, "data_segs_in");
> >         ASSERT_EQ(result.data_segs_out, 1, "data_segs_out");
> >         ASSERT_EQ(result.bad_cb_test_rv, 0x80, "bad_cb_test_rv");
> >         ASSERT_EQ(result.good_cb_test_rv, 0, "good_cb_test_rv");
> > -       ASSERT_EQ(result.num_listen, 1, "num_listen");
> > -
> > -       /* 3 comes from one listening socket + both ends of the connection */
> > -       ASSERT_EQ(result.num_close_events, 3, "num_close_events");
> > +       ASSERT_EQ(result.num_listen_close, 1, "num_listen");
> > +       ASSERT_EQ(result.num_last_ack, 1, "num_last_ack");
> > +       ASSERT_EQ(result.num_fin_wait2, 1, "num_fin_wait2");
> >
> >         /* check setsockopt for SAVE_SYN */
> >         key = 0;
> > diff --git i/tools/testing/selftests/bpf/progs/test_tcpbpf_kern.c w/tools/testing/selftests/bpf/progs/test_tcpbpf_kern.c
> > index 3e6912e4df3d..2c5ffb50d6e0 100644
> > --- i/tools/testing/selftests/bpf/progs/test_tcpbpf_kern.c
> > +++ w/tools/testing/selftests/bpf/progs/test_tcpbpf_kern.c
> > @@ -55,9 +55,11 @@ int bpf_testcb(struct bpf_sock_ops *skops)
> >  {
> >         char header[sizeof(struct ipv6hdr) + sizeof(struct tcphdr)];
> >         struct bpf_sock_ops *reuse = skops;
> > +       struct tcpbpf_globals *gp;
> >         struct tcphdr *thdr;
> >         int good_call_rv = 0;
> >         int bad_call_rv = 0;
> > +       __u32 key_zero = 0;
> >         int save_syn = 1;
> >         int rv = -1;
> >         int v = 0;
> > @@ -155,26 +157,21 @@ int bpf_testcb(struct bpf_sock_ops *skops)
> >         case BPF_SOCK_OPS_RETRANS_CB:
> >                 break;
> >         case BPF_SOCK_OPS_STATE_CB:
> > -               if (skops->args[1] == BPF_TCP_CLOSE) {
> > -                       __u32 key = 0;
> > -                       struct tcpbpf_globals g, *gp;
> > -
> > -                       gp = bpf_map_lookup_elem(&global_map, &key);
> > -                       if (!gp)
> > -                               break;
> > -                       g = *gp;
> > -                       if (skops->args[0] == BPF_TCP_LISTEN) {
> > -                               g.num_listen++;
> > -                       } else {
> > -                               g.total_retrans = skops->total_retrans;
> > -                               g.data_segs_in = skops->data_segs_in;
> > -                               g.data_segs_out = skops->data_segs_out;
> > -                               g.bytes_received = skops->bytes_received;
> > -                               g.bytes_acked = skops->bytes_acked;
> > -                       }
> > -                       g.num_close_events++;
> > -                       bpf_map_update_elem(&global_map, &key, &g,
> > -                                           BPF_ANY);
> > +               gp = bpf_map_lookup_elem(&global_map, &key_zero);
> > +               if (!gp)
> > +                       break;
> > +               if (skops->args[1] == BPF_TCP_CLOSE &&
> > +                   skops->args[0] == BPF_TCP_LISTEN) {
> > +                       gp->num_listen_close++;
> > +               } else if (skops->args[1] == BPF_TCP_LAST_ACK) {
> > +                       gp->total_retrans = skops->total_retrans;
> > +                       gp->data_segs_in = skops->data_segs_in;
> > +                       gp->data_segs_out = skops->data_segs_out;
> > +                       gp->bytes_received = skops->bytes_received;
> > +                       gp->bytes_acked = skops->bytes_acked;
> > +                       gp->num_last_ack++;
> > +               } else if (skops->args[1] == BPF_TCP_FIN_WAIT2) {
> > +                       gp->num_fin_wait2++;
I meant with the above change in "case BPF_SOCK_OPS_STATE_CB".
The retry-and-wait in tcpbpf_user.c can be avoided.

What may still be needed in tcpbpf_user.c is to use shutdown and
read-zero to ensure the sk has gone through those states before
calling verify_result().  Something like this [ uncompiled code again :) ]:

        /* Always send FIN from accept_fd first to
         * ensure it will go through FIN_WAIT_2.
         */
        shutdown(accept_fd, SHUT_WR);
        /* Ensure client_fd gets the FIN */
        err = read(client_fd, buf, sizeof(buf));
        if (CHECK(err != 0, "read-after-shutdown(client_fd):",
                  "err:%d errno:%d\n", err, errno))
                goto close_accept_fd;

        /* FIN sends from client_fd and it must be in LAST_ACK now */
        shutdown(client_fd, SHUT_WR);
        /* Ensure accept_fd gets the FIN-ACK.
         * accept_fd must have passed the FIN_WAIT2.
         */
        err = read(accept_fd, buf, sizeof(buf));
        if (CHECK(err != 0, "read-after-shutdown(accept_fd):",
                  "err:%d errno:%d\n", err, errno))
                goto close_accept_fd;

	close(server_fd);
	close(accept_fd);
	close(client_fd);

	/* All sk has gone through the states being tested.
	 * check the results now.
	 */
	verify_result(map_fd, sock_map_fd);

> >                 }
> >                 break;
> >         case BPF_SOCK_OPS_TCP_LISTEN_CB:
> > diff --git i/tools/testing/selftests/bpf/test_tcpbpf.h w/tools/testing/selftests/bpf/test_tcpbpf.h
> > index 6220b95cbd02..0dec324ba4a6 100644
> > --- i/tools/testing/selftests/bpf/test_tcpbpf.h
> > +++ w/tools/testing/selftests/bpf/test_tcpbpf.h
> > @@ -12,7 +12,8 @@ struct tcpbpf_globals {
> >         __u32 good_cb_test_rv;
> >         __u64 bytes_received;
> >         __u64 bytes_acked;
> > -       __u32 num_listen;
> > -       __u32 num_close_events;
> > +       __u32 num_listen_close;
> > +       __u32 num_last_ack;
> > +       __u32 num_fin_wait2;
> >  };
> >  #endif
> 
> I can look at pulling this in and including it as a patch 5 if you
> would prefer. If I find any issues I will let you know.
> 
> > I also noticed the bytes_received/acked depends on the order of close(),
> > i.e. always close the accepted fd first.  I think a comment
> > in the tcpbpf_user.c is good enough for now.
> 
> Okay, I can add a comment explaining this.
> 
> > [ It does not have to be in this set and it can be done in another
> >   follow up effort.
> >   Instead of using a bpf map to store the result, using global
> >   variables in test_tcpbpf_kern.c will simplify the code further. ]
> 
> I assume this comment is about the changes to test_tcpbpf_kern.c? Just
> want to clarify as I assume this isn't about adding the comment about
> the socket closing order affecting the bytes_received/acked.
Right, it is unrelated to the "adding the comment about socket closing order".
It is about changing test_tcpbpf_kern.c and tcpbpf_user.c to
use global variables instead of bpf map to store results.
Again, it can be done later.  This can be used as an example:
b18c1f0aa477 ("bpf: selftest: Adapt sock_fields test to use skel and global variables")