netdev - Re: [PATCH net-next] RDS: deliver zerocopy completion notification with data as an optimization

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20180221221428.GG15244@oracle.com>
Date:   Wed, 21 Feb 2018 17:14:28 -0500
From:   Sowmini Varadhan <sowmini.varadhan@...cle.com>
To:     Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc:     Network Development <netdev@...r.kernel.org>,
        David Miller <davem@...emloft.net>, rds-devel@....oracle.com,
        Santosh Shilimkar <santosh.shilimkar@...cle.com>
Subject: Re: [PATCH net-next] RDS: deliver zerocopy completion notification
 with data as an optimization

On (02/21/18 16:54), Willem de Bruijn wrote:
> 
> I'd put this optimization behind a socket option.

Yes, that thought occurred to me as well- I think RDS applications
are unlikely to use the error_queue path because, as I mentioned
before, these are heavily request-response based, so you're 
going to be getting something back for each sendmsg anyway.

So if I had a sockopt, it would be something that would be
"piggyback completion" most (all?) of the time anyway, but I suppose
making it explicit is useful to have.

> Either that, or always store cookies on an RDS specific queue, return as recv
> cmsg and then remove the alternative error queue path completely.

I think the error queue path is good to have, if only to be aligned
with other socket families.

> > +       spin_lock_irqsave(&q->lock, flags);
> 
> This seems expensive on every recvmsg. Even if zerocopy is not enabled.

noted, I'll predicate it on both zcopy and the sockopt suggestion.

> > +       if (skb_queue_empty(q)) {
> > +               spin_unlock_irqrestore(&q->lock, flags);
> > +               return 0;
> > +       }
> > +       skb_queue_walk_safe(q, skb, tmp) {
> 
> This too. If anything, just peek at the queue head and skip otherwise.
> Anything else will cause an error, which blocks regular read and write
> on the socket.

but that would be technically incorrect, because you could have a 
mix/match of the zcopy notification with other sk_error_queue messages
(given that rds_rm_zerocopy_callback looks at the tail, and
creates a new skb if the tail is not a SO_EE_ORIGIN_ZCOOKIE).

Of course, its true that *today* the only thing in the rds socket
error queue is the cookie list, so this queue walk is a bit of overkill..

but maybe I am missing something you are concerned about? The queue walk is
intended to pull out the first SO_EE_ORIGIN_ZCOOKIE (if it exists)
and return quietly otherwise (leaving err_queue unchanged)- where
do you see the blocking error?

> > +               if (list_empty(&rs->rs_recv_queue) && nonblock) {
> > +                       ncookies = rds_recvmsg_zcookie(rs, msg);
> > +                       if (ncookies) {
> > +                               ret = 0;
> 
> This probably changes semantics for MSG_DONTWAIT. Will it ever return 0 now?

The only time MSG_DONTWAIT was previously returning 0 for PF_RDS
was when there was a congestion notification or rdma completion
i.e., the checks for rs_notify_queue and rs_cong_notify earlier
in the function, and in those 2 cases it was not returning data
(i.e. ret was always 0). In all other cases  rds_recvmsg would
always return either
- ret == -1, with errno set to EAGAIN or ETIMEDOUT, depending on the value
  of MSG_DONTWAIT, or,
- ret > 0 with data bytes.

That behavior (as well existing behavior for congestion notification
and rdma completion) has not changed. The only thing that changed is that
if there is no data to pass up, the ret will be 0, errno will be 0,
and the ancillary data may have completion cookies.

--Sowmini