lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 20 May 2021 13:00:34 -0700
From:   Cong Wang <xiyou.wangcong@...il.com>
To:     John Fastabend <john.fastabend@...il.com>
Cc:     Linux Kernel Network Developers <netdev@...r.kernel.org>,
        bpf <bpf@...r.kernel.org>, Cong Wang <cong.wang@...edance.com>,
        Jiang Wang <jiang.wang@...edance.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Jakub Sitnicki <jakub@...udflare.com>,
        Lorenz Bauer <lmb@...udflare.com>
Subject: Re: [Patch bpf] selftests/bpf: Retry for EAGAIN in udp_redir_to_connected()

On Thu, May 20, 2021 at 10:47 AM John Fastabend
<john.fastabend@...il.com> wrote:
>
> Cong Wang wrote:
> > On Wed, May 19, 2021 at 2:56 PM John Fastabend <john.fastabend@...il.com> wrote:
> > >
> > > Cong Wang wrote:
> > > > From: Cong Wang <cong.wang@...edance.com>
> > > >
> > > > We use non-blocking sockets for testing sockmap redirections,
> > > > and got some random EAGAIN errors from UDP tests.
> > > >
> > > > There is no guarantee the packet would be immediately available
> > > > to receive as soon as it is sent out, even on the local host.
> > > > For UDP, this is especially true because it does not lock the
> > > > sock during BH (unlike the TCP path). This is probably why we
> > > > only saw this error in UDP cases.
> > > >
> > > > No matter how hard we try to make the queue empty check accurate,
> > > > it is always possible for recvmsg() to beat ->sk_data_ready().
> > > > Therefore, we should just retry in case of EAGAIN.
> > > >
> > > > Fixes: d6378af615275 ("selftests/bpf: Add a test case for udp sockmap")
> > > > Reported-by: Jiang Wang <jiang.wang@...edance.com>
> > > > Cc: John Fastabend <john.fastabend@...il.com>
> > > > Cc: Daniel Borkmann <daniel@...earbox.net>
> > > > Cc: Jakub Sitnicki <jakub@...udflare.com>
> > > > Cc: Lorenz Bauer <lmb@...udflare.com>
> > > > Signed-off-by: Cong Wang <cong.wang@...edance.com>
> > > > ---
> > > >  tools/testing/selftests/bpf/prog_tests/sockmap_listen.c | 6 +++++-
> > > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
> > > > index 648d9ae898d2..b1ed182c4720 100644
> > > > --- a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
> > > > +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
> > > > @@ -1686,9 +1686,13 @@ static void udp_redir_to_connected(int family, int sotype, int sock_mapfd,
> > > >       if (pass != 1)
> > > >               FAIL("%s: want pass count 1, have %d", log_prefix, pass);
> > > >
> > > > +again:
> > > >       n = read(mode == REDIR_INGRESS ? p0 : c0, &b, 1);
> > > > -     if (n < 0)
> > > > +     if (n < 0) {
> > > > +             if (errno == EAGAIN)
> > > > +                     goto again;
> > > >               FAIL_ERRNO("%s: read", log_prefix);
> > >
> > > Needs a counter and abort logic we don't want to loop forever in the
> > > case the packet is lost.
> >
> > It should not be lost because selftests must be self-contained,
> > if the selftests could not even predict whether its own packet is
> > lost or not, we would have a much bigger trouble than just this
> > infinite loop.
> >
> > Thanks.
>
> Add the retry counter its maybe 4 lines of code. When I run this in a container

Sure, then the next question is how many times are you going to retry?
Let's say, 10. Then the next question would be is 10 sufficient? Clearly
not, because if the packet can be dropped (let's say by firewall), it can
be delayed too (let's say by netem).

Really, are we going to handle all of such cases? Or if we simply
assume the environment is self-contained and not hostile, none of
the above would happen hence nothing needs to be checked.

> and my memcg kicks in for some unknown reason I don't want it to loop
> forever I want it to fail gracefully. Plus it just looks bad to encode
> a non-terminating loop, in my opinion, so I want the counter.

There could be hundreds of reasons to drop a packet, or delay a
packet. How many of them do you want to consider and how many
of them do you not consider? Please draw a boundary.

The boundary I drew is very clear: we just assume the selftests
environment is not hostile. In fact, I believe selftests should setup
such an environment before running any test, for example, by creating
a container. And I believe this would make everyone happy.

Thanks.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ