Message-ID: <CAF=yD-LecYGeH1Wg3kL-pDDKaFX0pbXzr_dfVPrsrBL3uL8VWw@mail.gmail.com>
Date:   Wed, 21 Feb 2018 16:54:56 -0500
From:   Willem de Bruijn <willemdebruijn.kernel@...il.com>
To:     Sowmini Varadhan <sowmini.varadhan@...cle.com>
Cc:     Network Development <netdev@...r.kernel.org>,
        David Miller <davem@...emloft.net>, rds-devel@....oracle.com,
        Santosh Shilimkar <santosh.shilimkar@...cle.com>
Subject: Re: [PATCH net-next] RDS: deliver zerocopy completion notification
 with data as an optimization

On Wed, Feb 21, 2018 at 3:19 PM, Sowmini Varadhan
<sowmini.varadhan@...cle.com> wrote:
> This commit is an optimization that builds on top of commit 01883eda72bd
> ("rds: support for zcopy completion notification") for PF_RDS sockets.
>
> Cookies associated with zerocopy completion are passed up on the POLLIN
> channel, piggybacked with data wherever possible. Such cookies are passed
> up as ancillary data (at level SOL_RDS) in a struct rds_zcopy_cookies when
> the return value of recvmsg() is >= 0. At most SO_EE_ORIGIN_MAX_ZCOOKIES
> cookies may be passed with each message.

I'd put this optimization behind a socket option.

Either that, or always store cookies on an RDS specific queue, return as recv
cmsg and then remove the alternative error queue path completely.

Having both paths is confusing, also in terms of system behavior (e.g., see
the signaling and sk_err handling in sock_dequeue_err_skb).

> +static int rds_recvmsg_zcookie(struct rds_sock *rs, struct msghdr *msg)
> +{
> +       struct sk_buff *skb, *tmp;
> +       struct sock_exterr_skb *serr;
> +       struct sock *sk = rds_rs_to_sk(rs);
> +       struct sk_buff_head *q = &sk->sk_error_queue;
> +       struct rds_zcopy_cookies done;
> +       u32 *ptr;
> +       int i;
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&q->lock, flags);

This seems expensive to do on every recvmsg, even when zerocopy is not enabled.

> +       if (skb_queue_empty(q)) {
> +               spin_unlock_irqrestore(&q->lock, flags);
> +               return 0;
> +       }
> +       skb_queue_walk_safe(q, skb, tmp) {

This too. If anything, just peek at the queue head and skip otherwise.
Anything else will cause an error, which blocks regular read and write
on the socket.

>  int rds_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
>                 int msg_flags)
>  {
> @@ -586,6 +623,7 @@ int rds_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
>         int ret = 0, nonblock = msg_flags & MSG_DONTWAIT;
>         DECLARE_SOCKADDR(struct sockaddr_in *, sin, msg->msg_name);
>         struct rds_incoming *inc = NULL;
> +       int ncookies;
>
>         /* udp_recvmsg()->sock_recvtimeo() gets away without locking too.. */
>         timeo = sock_rcvtimeo(sk, nonblock);
> @@ -609,6 +647,14 @@ int rds_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
>                         break;
>                 }
>
> +               if (list_empty(&rs->rs_recv_queue) && nonblock) {
> +                       ncookies = rds_recvmsg_zcookie(rs, msg);
> +                       if (ncookies) {
> +                               ret = 0;

This probably changes semantics for MSG_DONTWAIT. Will it ever return 0 now?
