lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Tue, 30 May 2023 11:34:32 -0700
From: John Fastabend <john.fastabend@...il.com>
To: Eric Dumazet <edumazet@...gle.com>, 
 John Fastabend <john.fastabend@...il.com>
Cc: jakub@...udflare.com, 
 daniel@...earbox.net, 
 bpf@...r.kernel.org, 
 netdev@...r.kernel.org, 
 ast@...nel.org, 
 andrii@...nel.org, 
 will@...valent.com
Subject: Re: [PATCH bpf v10 07/14] bpf: sockmap, wake up polling after data
 copy

Eric Dumazet wrote:
> On Tue, May 23, 2023 at 4:56 AM John Fastabend <john.fastabend@...il.com> wrote:
> >
> > When TCP stack has data ready to read sk_data_ready() is called. Sockmap
> > overwrites this with its own handler to call into BPF verdict program.
> > But, the original TCP socket had sock_def_readable that would additionally
> > wake up any user space waiters with sk_wake_async().
> >
> > Sockmap saved the callback when the socket was created so call the saved
> > data ready callback and then we can wake up any epoll() logic waiting
> > on the read.
> >
> > Note we call on 'copied >= 0' to account for returning 0 when a FIN is
> > received because we need to wake up user for this as well so they
> > can do the recvmsg() -> 0 and detect the shutdown.
> >
> > Fixes: 04919bed948dc ("tcp: Introduce tcp_read_skb()")
> > Reviewed-by: Jakub Sitnicki <jakub@...udflare.com>
> > Signed-off-by: John Fastabend <john.fastabend@...il.com>
> > ---
> >  net/core/skmsg.c | 11 ++++++++++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> > index bcd45a99a3db..08be5f409fb8 100644
> > --- a/net/core/skmsg.c
> > +++ b/net/core/skmsg.c
> > @@ -1199,12 +1199,21 @@ static int sk_psock_verdict_recv(struct sock *sk, struct sk_buff *skb)
> >  static void sk_psock_verdict_data_ready(struct sock *sk)
> >  {
> >         struct socket *sock = sk->sk_socket;
> > +       int copied;
> >
> >         trace_sk_data_ready(sk);
> >
> >         if (unlikely(!sock || !sock->ops || !sock->ops->read_skb))
> >                 return;
> > -       sock->ops->read_skb(sk, sk_psock_verdict_recv);
> > +       copied = sock->ops->read_skb(sk, sk_psock_verdict_recv);
> > +       if (copied >= 0) {
> > +               struct sk_psock *psock;
> > +
> > +               rcu_read_lock();
> > +               psock = sk_psock(sk);
> > +               psock->saved_data_ready(sk);
> > +               rcu_read_unlock();
> > +       }
> >  }
> >
> >  void sk_psock_start_verdict(struct sock *sk, struct sk_psock *psock)
> > --
> > 2.33.0
> >
> 
> It seems psock could be NULL here, right ?
> 
> What do you think if I submit the following fix ?
> 
> diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> index a9060e1f0e4378fa47cfd375b4729b5b0a9f54ec..a29508e1ff3568583263b9307f7b1a0e814ba76d
> 100644
> --- a/net/core/skmsg.c
> +++ b/net/core/skmsg.c
> @@ -1210,7 +1210,8 @@ static void sk_psock_verdict_data_ready(struct sock *sk)
> 
>                 rcu_read_lock();
>                 psock = sk_psock(sk);
> -               psock->saved_data_ready(sk);
> +               if (psock)
> +                       psock->saved_data_ready(sk);
>                 rcu_read_unlock();
>         }
>  }

Yes please do presumably this is plausible if user delete map entry while
data is being sent and we get a race. We don't have any tests for this
in our CI though because we never delete socks after adding them and
rely on the sock close. This shouldn't happen in that path because of the
data_ready is blocked on SOCK_DEAD flag iirc.

I'll think if we can add some stress test to add map update/delete in
a tight loop with live socket sending/receiving traffic.

Thanks

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ